Our experiments to determine the utility of color provide results which establish
an empirical basis for determining general principles about the role of color in pattern
recognition. We have found that color may be irrelevant under conditions that represent
global stimulus processing and when the shape and color dimensions are handled as
separable. However, our results also indicate that color is involved in the shape
recognition process when the separate dimensions of color and shape are integral or when
the stimulus is locally processed. Thus, our experiments have shown that a single answer
concerning the utility of color in pattern recognition is not to be expected.

The measured power spectra for both luminance and chrominance pictorial information
were found to roll off at high spatial frequencies as the inverse square of the spatial
frequency. We feel that these results support the hypothesis that edge transitions
represent a significant feature of both the luminance and chrominance content of
pictorial information. Unlike the luminance power spectra, however, which must have
a maximum value at dc, the chrominance power spectra may have a value close to zero
at dc. Also, our measurements of the rms modulation depths for pictorial scenes have
shown that the average luminance fluctuations are large, whereas the average chrominance
variations are relatively small.

In this report we present the results of a series of experiments that established
the relationship between our sharpness descriptor, the visual capacity, and the
subjective sharpness of displayed images. The just-noticeable difference (jnd) in image
sharpness was measured as a function of display bandwidth. At high spatial
frequencies these measurements were found to be in excellent agreement with the
assertion that display sharpness is mediated by the perceived rms gradient content of an
image, a quantity proportional to the square root of the visual capacity. The addition
of chrominance information to a black-and-white image was not found to appreciably affect
the perception of image sharpness. Finally, the measured results on the subjective
sharpness for both pictorial images and single-transition luminance edges proved to be
indistinguishable. We feel that this result supports the contention that edge
transitions are important in determining image sharpness in pictorial scenes.

In this report a descriptor for the total channel capacity of the display-observer
system is developed that includes the statistical properties of both luminance and
chrominance information. This descriptor is based on a widely accepted model of the
visual system that contains three independent channels: one channel that transmits
the luminance information, and two channels that transmit the chrominance information.
For the first time the nonlinearities associated with luminance perception are included
in our model through a novel interpretation of recent psychophysical experiments. The
model for the total information capacity is used to predict the optimum allocation for
a commercial television system; the results are shown to be in good agreement with
current U.S. television practice.

SECTION I


INTRODUCTION 


All  practical  display  systems  are  concerned  with  the  efficient  transfer 
of  information  to  the  human  observer.  This  objective  requires  a detailed 
understanding  of  the  perceptual  capabilities  of  the  observer,  the  performance 
characteristics of the display, the properties of the information to be
transmitted, and the observer response intended from this information. Furthermore,
these  separate  elements  must  be  fused  together  in  a manner  that  will  result  in 
the  optimum  display  design  for  a given  task. 

Historically,  displays  have  been  developed  empirically  to  maximize,  under 
specific  design  constraints,  the  performance  of  the  display.  Although  an 
understanding  of  both  the  relevant  visual  processes  and  the  salient  features 
of  the  information  to  be  displayed  were  important  in  the  overall  conception 
of the display, there was no systematic methodology available to use in
specifying detailed design requirements. Typically, design refinements were
resolved  by  a process  of  iteration  to  achieve  the  best  overall  compromise  in 
display  performance.  This  approach,  exemplified  by  commercial  television, 
can  be  impressively  successful.  Thus  today,  almost  25  years  after  the  adoption 
of U.S. commercial television standards, satisfaction with them is still
nearly  universal.  However,  although  the  empirical  approach  has  resulted  in  a 
wealth  of  information  about  display  design  and  certain  aspects  of  human  vision, 
it  does  not  readily  lend  itself  to  generalization.  In  effect,  each  display 
must be optimized individually, a process that is both time-consuming and
expensive.

In this report, and in two previous reports [1,2] (hereafter referred to
as TR1 and TR2, respectively), we have presented a family of mathematical
descriptors that we have developed for the quantitative evaluation of the
performance of displays viewed by human observers. This work follows the research
approach of Rose [3], Schade [4], Biberman [5], and others [6-8] for advancing
the understanding of displayed information by the development of models of the
display-observer system.

This report is organized into a discussion of two parallel courses of
research. In Section II, we present the results of a series of experiments
designed to determine the general principles underlying the potential utility
of color as an aid in the recognition of shapes in various contexts. In
Sections III through V, we have continued to expand the development of the
mathematical description of the display-observer system presented in TR1 and TR2.

Broadly  speaking,  the  image  descriptors  we  have  developed  are  concerned 
with issues of information visibility. The various descriptor models address
different  spatial  attributes  whose  visibility  should  be  maximized  for  optimum 
display  design.  There  remains  the  question,  however,  as  to  how  the  visible 
information  is  organized  and  handled  by  the  observer  to  meet  some  performance 
goal.  For  example,  the  role  of  color  in  the  recognition  of  visible  patterns 
is  beyond  the  scope  of  our  descriptor  formulation. 

Rather than treat the recognition issue as a byproduct of the
descriptors, we have, therefore, addressed it directly. Our approach is to provide
an  empirical  basis  for  pointing  out  some  general  principles  about  the  role  of 
color  in  recognition.  In  Section  II,  experiments  are  presented  that  address 
three issues: (1) the potential for color as an aid to recognition when the
primary attention is given to stimulus shape, (2) whether a demonstrated
performance advantage due to color coding is a consequence of sensory or of
cognitive issues, and (3) whether color involvement is demonstrable in steadily
viewed displays where speed-stress is not a factor.

Our  results  indicate  that  color  is  involved  in  the  shape  recognition 
process  when  the  separate  dimensions  of  color  and  shape  are  integral  or  when 
the  stimulus  is  locally  processed.  Integral  dimension  processing  is  suggested 
for  alphanumeric  recognition  under  brief  exposure  stimulation;  local  processing 
is suggested for the detection of large shapes (subtending 1° of visual angle)
under steady viewing. On the other hand, color can be irrelevant under conditions
that  we  feel  represent  global  stimulus  processing,  and  when  the  shape  and  color 
dimensions  are  handled  as  separable. 

Whereas separable-dimensions processing is suggested for simple-shape
recognition under brief exposure stimulation, global processing is suggested
for the detection of small shapes (subtending 1/2° of visual angle) under steady
viewing. Thus, our experiments show that a single answer concerning the utility
of color in pattern recognition is not to be expected.

Our approach in the development of display descriptors has been to use
the tools of statistical information theory. That is, we treat the displayed
information, the effects of sampling, and the noise sources (both display
and visual noise) as statistically averaged properties of the display-observer
system. This approach has the important advantage of allowing one to
determine the most likely behavior of the display for a given set of conditions.
Further, as we show in Section III for pictorial scenes, the conclusions
derived from statistical estimates may be expected to have wide applicability
because the variations in these values are relatively small.

For  the  application  of  our  display  descriptors  to  actual  displays,  the 
statistical  properties  of  the  information  to  be  presented  on  the  display  must 
be  known.  In  TR1  we  measured  the  ensemble-averaged  power  spectral  density 
for  luminance  information,  as  represented  by  off-the-air  commercial  television. 
In  that  study  we  found  that  the  power  spectral  density,  for  frequencies  above 
a low-frequency cutoff, decreases as the square of the inverse spatial
frequency. That is, the luminance power spectral density for actual scenes was
identical in form to the power spectral density for randomly located
luminance-edge transitions. As explained in TR1, this result supports the
hypothesis  that  edge  transitions  represent  a significant  feature  in  natural 
scenes.  In  Section  III  of  this  report  we  have  extended  our  luminance  power 
spectral density measurements to show that the inverse square frequency
roll-off described above is not affected by the finite angular extent of the
scenes studied. In Section III we also present the first results of
measurements of the chrominance power spectral densities for pictorial scenes. As
in  the  case  of  the  luminance  power  spectra,  the  chrominance  power  spectra 
were  found  to  roll  off  at  high  spatial  frequencies  as  the  inverse  square 
of  the  spatial  frequency.  However,  unlike  the  luminance  power  spectra, 
which  must  have  their  maximum  value  at  dc,  it  was  found  that  the  chrominance 
power  spectra  can  have  a value  close  to  zero  at  dc.  Finally,  results  are 
presented for the rms modulation depths for both the luminance and chrominance
information  in  pictorial  scenes.  These  results  show  that  while  the  average 
luminance  fluctuations  in  pictorial  scenes  are  highly  modulated,  the  average 
chrominance  variations  are  relatively  small. 
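
As an illustration of why randomly located edges lead to an inverse-square roll-off, consider a standard idealization. It is an assumption made here for illustration and is not taken from TR1: a one-dimensional luminance profile that switches between two levels $\pm a$ at positions forming a Poisson process with a mean density of $\lambda$ edges per unit visual angle (the random telegraph signal). The symbols $a$, $\lambda$, $x$, and $\nu$ are introduced only for this sketch. Its autocorrelation $R(x)$ and power spectral density $S(\nu)$ are

```latex
R(x) = a^{2} e^{-2\lambda |x|},
\qquad
S(\nu) = \int_{-\infty}^{\infty} R(x)\, e^{-i 2\pi \nu x}\, dx
       = \frac{4 \lambda a^{2}}{4\lambda^{2} + (2\pi\nu)^{2}},
\qquad
S(\nu) \approx \frac{\lambda a^{2}}{\pi^{2} \nu^{2}}
\quad \text{for } \nu \gg \lambda/\pi .
```

Under this idealization the spectrum is flat below roughly $\nu = \lambda/\pi$, which plays the role of a low-frequency cutoff, and falls as the inverse square of the spatial frequency above it.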


An important aspect of our program has been the verification of our
mathematical display descriptors. In Section IV we present the results of
a series of experiments that established the relationship between our
sharpness descriptor, the visual capacity, and the subjective sharpness of
displayed images. In this study the just-noticeable difference in image
sharpness was measured as a function of display bandwidth. At high spatial
frequencies these measurements were found to be in excellent agreement with
the assertion that display sharpness is mediated by the perceived rms gradient
content of the image, a quantity proportional to the square root of the visual
capacity. This result indicates that the visual capacity can be used as a
normalizing function to predict the effective sharpness of displays with
different MTFs. Another important result of these measurements was the
observation that the addition of chrominance information to a black-and-white
image does not appreciably affect the perception of image sharpness for that
image. Finally, the measured results for a series of experiments on the
subjective sharpness of both representative images and single-transition
luminance edges proved to be indistinguishable. This result gives additional
support to the contention that edges are a significant feature in determining
the perceived attributes of actual scenes.

Our display descriptors for luminance perception, described in
reports TR1 and TR2, were constructed on the assumption that the visual
system was linear, an assumption that implicitly limited our analysis to signals
of small amplitude. In Section V of this report we have removed this
restriction. Instead, a model has been employed that takes into consideration the
nonlinearity of brightness perception. This model, based on our
interpretation of recent psychophysical experiments, also supports our previous
assumption that the human visual system responds to the square of the signal
amplitude.

Also, our previous efforts had been directed only at modeling displayed
luminance information. In Section V of the report, however, we have extended our
descriptor for the total channel capacity of a display to include both
chrominance and luminance information. This descriptor is based on a widely accepted
model of the visual system that contains three independent channels: one
channel which transmits the luminance information, and two channels which
transmit the chrominance information [9]. The distribution of perceivable
chrominance levels is included by using the measured results of appropriate
psychophysical experiments. As a practical example, the total channel
capacity is used to predict the optimum allocation for a commercial
television system; the results are shown to be in good agreement with current
U.S. television practice.


SECTION  II 


EFFECT  OF  COLOR  ON  PATTERN  RECOGNITION 
A.  INTRODUCTION  TO  EXPERIMENTS 

1.  Overview 

An observer engaged in visual pattern recognition is involved in decision
making. That is, he must decide on the proper assignment of the inspected
visual pattern to some category in his memory. It, therefore, appears reasonable
that the more information this pattern presents to the observer, the more
information he has to base his decision on; hence his decision making for
recognition will be made more efficient. In particular, if patterns contained
both shape and color information, one would assume it likely that
recognizability would be better than if color information were absent.

The difficulty with this assumption is that it does not take into account
the inherent flexibility that permits a decision maker to reject or include
color in the decision, depending on how the visual stimulus is processed.
Although it is certain that color will benefit pattern recognition in many
circumstances, it is also likely that there will be circumstances in which
color will be disregarded in the recognition process. The thrust of our
investigation of color effects in pattern recognition is to determine, in a
general sense, when color is included in decision making for pattern recognition.

We  describe,  next,  the  experimental  setup  and  a data  collection  algorithm 
used  throughout  our  study.  The  three  major  sections  which  follow  are  concerned 
with  alphanumeric  recognition  after  brief  exposure  to  stimuli,  simple-shape 
recognition after brief exposure to stimuli, and a new paradigm for the
investigation of shape recognition in steadily viewed displays in the absence of
speed-stress. Our intent is to point out general issues of color involvement
in pattern recognition; these issues are summarized in Section II.E.

2.  Instrumentation  and  Procedures 

Our experimental setup consisted of a NOVA* 2/10 minicomputer interfaced
through a Lexidata** graphics display board to an RGB (red-green-blue) color
monitor. For the display mode used in our experiments, a centered 128 x 128
point screen region could be addressed with red, green, or yellow (red + green)
dots. Up to 16,384 ($128^2$) colored dots could be displayed at a time, and the
entire display could be changed as often as every 1/60 s (TV field rate). Scan
interlace was not used on the monitor.

*Made by Data General Corp., Southboro, Mass.
**Produced by Lexidata Corp., Lexington, Mass.

The 1/60-s resolution of display change was important for our brief-exposure
experiments; it also allowed us to use flicker photometry to set the
red and green luminances to be equal. This was accomplished by displaying
all of the 128 x 128 points as red and green dots on alternating 1/60-s fields
and, during steady viewing of this alternating pattern, adjusting the red-green
ratio so that a nonflickering yellow field was seen. A silicon photocell
was used to view this field alternation on an oscilloscope to check that
the red and green phosphor onset and offset times were equivalent and
sufficiently rapid, i.e., that our setup did, indeed, provide a 1/60-s display
resolution.

The observer sat behind a desk, his head about 10.5 ft from the monitor
screen, and used a button box to communicate with the computer. For those
experiments where brief-exposure stimuli were used (see Sections II.B and II.C),
white fluorescent ceiling lights were kept on in the viewing area. The lights
were situated behind the monitor screen, providing diffuse viewing-area
illumination without screen glare to reduce the apparent contrast of the display.
This was deemed the most suitable way to stretch the response-accuracy
psychometric function without removing color legibility, so that 1/60-s
changes in exposure duration were small enough to reveal performance trends.
With overhead lights on, the blank monitor screen was at about 2 mL. A
schematic diagram of the viewing area arrangement is shown in Fig. 1.

Figure 1. Viewing arrangement for experiments.

In all of our experiments, some type of threshold measurement was carried
out, either to set a stimulus level for the experiment or as the main feature
of the experiment. This amounts to finding the stimulus strength necessary to
yield an appropriate level of detection accuracy. The procedure we use is the
sequential estimation procedure introduced by Wetherill [10]. It is a
particularly efficient means for estimating the desired stimulus parameter. One first
decides on a step size to increase and decrease stimulus strength. The subject
is run through a sequence of stimulus trials with the stimulus strength
decreased by a step after a pair of correct responses, and with the stimulus
strength increased by a step after a pair of responses at least one of which
is incorrect. One then has runs of increasing and decreasing steps in
stimulus strength. The transition from runs of increasing to runs of decreasing
strength is termed a peak, and the transition from decreasing to increasing
runs is termed a valley. The average of the stimulus values at the peaks and
valleys provides the estimate of the stimulus value at which accuracy
satisfies the relation $P^2 = 1 - P^2$, or $P = 1/\sqrt{2} = 0.707$. In our
procedures we stop the threshold determination after eight peaks and valleys
are produced. We throw out the first peak and valley to avoid starting point
bias, and form our 70.7% estimate from the remaining peaks and valleys. In this
procedure pairs of trials are inspected. Alternatively, quadruples of trials
may be used to estimate the stimulus value where $P^4 = 1 - P^4$ is satisfied,
i.e., where $P = 0.841$. The 84.1% estimate was used only in our experiment in
Section II.C. In experiments where stimulus strength was varied by changing
exposure duration, the step size for the duration change was 1/60 s. In the
experiment of Section II.D, dot number was the parameter of interest; the step
size was set at 500 dots. These procedures were carried out by the computer.
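
The sequential estimation procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program that ran on the minicomputer: the function names, the callable `respond`, and the reading of "eight peaks and valleys" as eight reversals in total (with the first two discarded) are all hypothetical choices made for the example.

```python
def target_accuracy(group_size):
    """Accuracy P at which the staircase settles, from P**n = 1 - P**n."""
    return 0.5 ** (1.0 / group_size)


def staircase_threshold(respond, start, step, group_size=2,
                        n_reversals=8, n_discard=2, max_groups=10000):
    """Estimate a threshold with an up-down staircase over groups of trials.

    respond(strength) must return True for a correct response.
    Strength is lowered one step after a group of all-correct responses
    and raised one step if at least one response in the group is wrong.
    Assumption: n_reversals counts peaks and valleys together.
    """
    strength = start
    direction = 0       # -1 = decreasing run, +1 = increasing run, 0 = no run yet
    reversals = []      # stimulus values at the peaks and valleys
    for _ in range(max_groups):
        # Present every trial of the group, then inspect the group as a whole.
        results = [respond(strength) for _ in range(group_size)]
        new_direction = -1 if all(results) else +1
        if direction != 0 and new_direction != direction:
            reversals.append(strength)      # a peak or a valley
            if len(reversals) == n_reversals:
                break
        direction = new_direction
        strength += new_direction * step
    kept = reversals[n_discard:]            # drop the first peak and valley
    if not kept:
        raise RuntimeError("too few reversals to form an estimate")
    return sum(kept) / len(kept)


# Idealized observer (hypothetical): always correct at or above 5.5 units.
estimate = staircase_threshold(lambda s: s >= 5.5, start=10, step=1)
```

With the idealized observer the staircase oscillates between 5 and 6 and the estimate is 5.5. For a real observer, `group_size=2` targets the 70.7% point and `group_size=4` the 84.1% point, as given by `target_accuracy`.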

B.  INFLUENCE  OF  COLOR  IN  LETTER  AND  WORD  RECOGNITION 
1.  Introduction 

There  are  two  ways  in  which  color  can  be  used  for  alphanumeric  stimuli  - 
as a redundant or a nonredundant code. An example of the nonredundant use of
color  is  the  commonly  used  red  coloring  of  LED  (light-emitting  diode)  displays. 
The  use  of  color  in  this  situation  is  primarily  an  issue  of  making  display 
visibility possible in the sense of providing a match to the spectral
sensitivity of the eye. Consequently, a poor color choice can usually be rendered
acceptable  simply  by  increasing  the  display  brightness,  or  contrast.  The 
color,  per  se,  is  irrelevant  and  does  not  aid  the  observer  in  deciding  what 
the  displayed  message  is.  On  the  other  hand,  color  can  also  be  used  as  a 
redundant  stimulus  code.  The  question  we  will  concern  ourselves  with  here 
has  to  do  with  the  potential  for  color  as  a recognition  aid  when  the  color  is 
used  in  a redundant  fashion. 

This  is  not  a new  issue  for  experimental  psychology.  It  was  addressed, 
for  example,  by  Eriksen  and  Hake  [11],  who  concluded  that  completely  redundant 
color  coding  of  stimulus  shape  is  an  effective  means  for  enhancing  stimulus 
discrimination.  The  general  topic  has  been  discussed  by  Garner  [12]. 

Nevertheless, there is reason to wonder whether such color benefit
conclusions based on controlled laboratory experiments can be justifiably
generalized to a nonlaboratory setting. The particular point that concerns us
is one of attention priority. If one carefully instructs an observer in an
experiment about the availability of correlated shape and color attributes of
the stimulus, then insofar as the observer makes use of these instructions and
attends to both stimulus attributes, one would expect a performance benefit
due to the color coding of shape. Outside of the laboratory, however, one
might often expect a natural tendency to attach priority to the processing of
stimulus shape. Priority of attention to shape is particularly likely for
alphanumeric stimuli, where a lifetime of experience orients one to read a
displayed message without concern for its color, or even for precise details
of shape [13]. It is, therefore, worth asking whether, in a situation where
priority of attention is given to shape processing for alphanumeric stimuli,
a benefit of color coding is demonstrated. This is the question to which we
address ourselves here.

2.  Experimental  Procedure 

The sequence of photographs shown in Figs. 2 and 3 represents typical
stimulus trials. The observer viewed a yellow fixation symbol (Figs. 2a or
3a) on the monitor and, when ready, pressed a button which caused the
immediate presentation of a red or green stimulus centered directly above the
screen location of the removed fixation-symbol arrow. The stimulus could be
either a letter (Fig. 2b) or a four-letter word (Fig. 3b). When a
predetermined brief duration had elapsed, the stimulus was immediately removed from
the screen and replaced with a postexposure mask consisting of 1250 red and
1250 green dots, randomly located in a 2.6° x 2.0° region covering the stimulus
location. Adjacent to the mask were two response-choice letters, one in red
and one in green. The choices plus mask are shown in Figs. 2c and 3c. The
subject's task was to inspect the choices and decide which of the two letters,
disregarding color, had just appeared in the briefly displayed stimulus.
Selection of the top or bottom choice on the screen was indicated by the
pressing of one of two buttons. The correct response in Fig. 2c is clearly
the top letter. The correct response in Fig. 3c is the top letter, since D
appeared in the stimulus. Accuracy feedback was displayed (in yellow)
immediately after a response was given (Figs. 2d and 3d).

In  deciding  between  D or  N in  the  response  situation  of  Fig.  3c,  there 
is  no  redundancy  benefit  to  having  processed  "WORD"  in  the  stimulus  (Fig.  3b), 
since  either  choice  letter  could  be  used  in  the  target  letter  location  to 
make  up  a common  word.  All  word  stimulus  trials  had  this  property;  the  target 
letter  and  the  incorrect  choice  could  both  be  used  in  the  critical  letter 
position  to  construct  a common  word  with  the  nontarget  letters.  The  target 
letter  position  in  the  word  was  randomized  for  word  stimuli  in  the  experiment. 
This forced-choice procedure was introduced by Reicher [14] for investigating
processes in word recognition, and is regarded as the safest procedure for
minimizing guessing as a source of spurious results [15].
Figure 2. Letter stimulus trial. Fixation symbol (a) is
followed by brief letter exposure (b); this is
followed by dot mask and response choices (c).
Accuracy feedback (d) follows response.

Figure 3. Word stimulus trial. Similar to Fig. 2, except
that the stimulus (b) is a four-letter word.

Stimulus and choice letters (as well as the random dot mask) all had a
luminance of about 10 mL, regardless of color. Letters subtended about one
fourth of a degree, and were made up of dots subtending less than 3 min of
arc. The viewing area was illuminated by overhead lights (see Section II.A.2).
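
To give a feel for these angular sizes, the short sketch below converts them to linear extents on the screen. It assumes the viewing distance of about 10.5 ft given in Section II.A.2 applies here, and the helper name `subtended_size` is hypothetical; the results are rough figures, not measured screen dimensions.

```python
import math


def subtended_size(distance, angle_deg):
    """Linear extent of an object that subtends angle_deg at the given distance."""
    return 2.0 * distance * math.tan(math.radians(angle_deg) / 2.0)


VIEWING_DISTANCE_IN = 10.5 * 12.0   # about 10.5 ft, expressed in inches

# Letters subtended about one fourth of a degree.
letter_extent_in = subtended_size(VIEWING_DISTANCE_IN, 0.25)

# Dots subtended less than 3 min of arc, so this is an upper bound.
dot_extent_in = subtended_size(VIEWING_DISTANCE_IN, 3.0 / 60.0)

print(f"letter extent: about {letter_extent_in:.2f} in")
print(f"dot extent: under {dot_extent_in:.2f} in")
```

Under these assumptions a letter spans roughly half an inch on the screen and a single dot a little over a tenth of an inch.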

The  stimulus  color  was  equally  likely  to  be  red  or  green,  and  the  correct 
response  choice  was  as  likely  to  be  the  same  color  as  the  stimulus  as  not. 
Placement  of  the  correct  choice  in  the  top  or  bottom  response  position  was 
random.  The  random  dot  pattern  used  for  afterimage  masking  was  different  on 
each  trial.  The  subject's  task  was  to  attend  only  to  the  letters  (shape)  of 
the stimulus and to select that choice letter which was thought to be
presented in the stimulus. Subjects were instructed to disregard color, since
choice  selection  based  on  color  was  as  likely  to  be  wrong  as  correct.  The 
experimental  situation,  then,  is  one  in  which  priority  of  attention  is  given 
to  shape  processing. 

The  color  situations  in  this  experiment  are  outlined  in  Table  1. 


TABLE 1. STIMULUS-CHOICE COLOR COMBINATIONS

Situation   Stimulus Color   Color of Correct Response Choice
    1           Red               Red
    2           Red               Green
    3           Green             Green
    4           Green             Red

There were four color situations and two stimulus types: letters and
words. Consequently there were eight different conditions, randomly
interspersed throughout an experimental session. There were 304 experimental
trials per session; eleven observers were tested in different sessions. The
exposure duration for stimuli was set individually for an observer so that
recognition accuracy would be better than chance but less than perfect, so
that the data could reveal systematic error trends. This exposure duration,
typically 1/30 s, put accuracies in the neighborhood of 75% correct.

An experimental session lasted about an hour and consisted of the
following: The observer was introduced to the experiment with a short
instruction phase in which six trials, three with letter and three with word
stimuli, were presented; stimulus exposure was set at 500 ms. Then a warm-up
phase was run through. The warm-up phase had 32 trials - a random ordering
of 16 letter-stimulus and 16 word-stimulus trials - with stimulus exposure
set at 100 ms. Next, the proper stimulus exposure duration for the observer
was determined. This consisted of applying the general sequential estimation
procedure, outlined in Section II.A.2 (Instrumentation and Procedures), to
temporal threshold determination. Only letter stimuli were used in this
timing phase. After completion of the timing phase the main phase of the
experiment was run. To minimize fatigue problems, a five-minute intermission
occurred halfway through the main phase.
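
The main-phase design of eight randomly interspersed conditions over 304 trials can be sketched as follows. The report does not say how trials were divided among conditions; the equal split of 38 trials per condition, the function name `make_schedule`, and the tuple layout are assumptions made only for this illustration.

```python
import random
from itertools import product

# (stimulus color, color of correct response choice), as in Table 1.
COLOR_SITUATIONS = [("red", "red"), ("red", "green"),
                    ("green", "green"), ("green", "red")]
STIMULUS_TYPES = ["letter", "word"]


def make_schedule(n_trials=304, seed=None):
    """Return a shuffled list of (stimulus_type, color_situation) trials.

    Assumption: every condition appears equally often in a session.
    """
    conditions = list(product(STIMULUS_TYPES, COLOR_SITUATIONS))
    per_condition, remainder = divmod(n_trials, len(conditions))
    if remainder:
        raise ValueError("n_trials must be a multiple of the condition count")
    trials = conditions * per_condition
    random.Random(seed).shuffle(trials)     # intersperse conditions randomly
    return trials


schedule = make_schedule(seed=0)
stimulus_type, (stimulus_color, correct_choice_color) = schedule[0]
```

A trial counts toward the "same" condition when `stimulus_color == correct_choice_color` and toward the "opposite" condition otherwise.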

3.  Results 

The first question of interest is this: Is the color used in the recognition
task? In other words, is the observer more likely to be correct in his
response if the correct choice is the same color as the stimulus (same
condition) than if it is in the opposite color (opposite condition)? This question
is answered in Table 2.

TABLE 2. COLOR EFFECT ON PERFORMANCE

Stimulus Type   Same Condition   Opposite Condition   Difference
Words:              73.1%             65.0%              8.1%
Letters:            74.4%             66.4%              8.0%

The table entries are percent correct, averaged across eleven observers.
Of interest are the entries for the column marked "Difference." There was a
systematic difference between the same and opposite conditions, amounting to
an 8% difference in recognition accuracies. The effect was significant
(p < 0.01, Wilcoxon signed-rank test).

The above results show that color is used by observers in the recognition
task. We have involved observers in a task where priority of attention is to
stimulus shape, a condition which presumably simulates the natural tendency in
dealing with alphanumeric material. One might wonder whether, in this somewhat
realistic cognitive setting, information available in the stimulus color makes
its presence felt, in the sense of affecting recognition performance. According
to our results it does.

Of further interest is whether there are accuracy differences between
red and green stimuli. That is, quite aside from the issue of color coding,
is there any difference between red and green stimuli in terms of
recognizability? From Table 2 we see that the average accuracy on word trials
was 69% (the average of the 73.1% and 65.0% entries of Table 2), and the
average recognition accuracy on letter trials was 70.4% (average of 74.4% and
66.4%). These average accuracies are, in turn, averages across the two
stimulus colors, red and green. The breakdown by color is shown in Table 3.


TABLE 3. EFFECT OF STIMULUS COLOR ON PERFORMANCE

Stimulus Type    Red Stimuli    Green Stimuli    Average

Words:           75%            63%              69%
Letters:         74%            67%              70.5%

The red-green accuracy difference was significant for words (p < 0.01).
However, this tendency was not systematic for letter stimuli (p = 0.42). Since
the luminance values of the red and green stimuli were matched, one would not
expect any differences between red and green stimuli to be due to the spectral
sensitivity of the eye. The source of such an effect is probably cognitive,
involving high-level processing. This is suggested by the different red-green
accuracy differentials obtained in word processing and in single-letter
processing, respectively; word processing involves cognitive processes
different from those that are operative in letter processing [16,17]. In any
event, one can say that a benefit for red over green is strongly suggested.
That is in agreement with results obtained in a different context by Tyte
et al. [18].

We next ask whether there is any response bias for color. That is, do
observers have any systematic tendency to respond in favor of one color over
the other? This is of interest in display design in terms of the likelihood
of false alarms with colored displays. Information concerning this question
is recovered from the data by the following relations:

$$P(\text{respond red}) = \tfrac{1}{4}\,[A + (1 - B) + C + (1 - D)]$$

$$P(\text{respond green}) = \tfrac{1}{4}\,[(1 - A) + B + (1 - C) + D]$$

where

A = P(pick red choice given red stimulus)

B = P(pick green choice given red stimulus)

C = P(pick red choice given green stimulus)

D = P(pick green choice given green stimulus)

Applying  these  relations  we  found  the  following  probability  estimates: 

For letters:    P(respond red)   = 0.54
                P(respond green) = 0.46

For words:      P(respond red)   = 0.54
                P(respond green) = 0.46
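
The two relations above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the four input rates are hypothetical and are not data from the experiment. The factor of 1/4 is the value that makes the two probabilities sum to one.

```python
def response_bias(a, b, c, d):
    """Return (P(respond red), P(respond green)).

    a: P(pick red choice   | red stimulus)
    b: P(pick green choice | red stimulus)
    c: P(pick red choice   | green stimulus)
    d: P(pick green choice | green stimulus)
    """
    p_red = (a + (1 - b) + c + (1 - d)) / 4
    p_green = ((1 - a) + b + (1 - c) + d) / 4
    return p_red, p_green


# Hypothetical rates, chosen only to exercise the formula.
p_red, p_green = response_bias(0.78, 0.70, 0.64, 0.68)
print(f"P(respond red) = {p_red:.2f}, P(respond green) = {p_green:.2f}")
```

Because every term in one relation appears complemented in the other, the two estimates always sum to one, so a value above 0.5 for red directly implies a bias against green.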

The tendency to respond red was significant for both letters and words
(p < 0.01). The response bias in favor of red is interesting in light of the
conventional use of red as a danger or emergency color code. Our results
suggest a readiness to select this color even in a situation where the red
coloring has no particular significance. This probably means that the
culturally specified significance of red is maintained even when it is not
intended in a particular setting. The choice of red color coding should,
therefore, be reserved for messages where the cost of message rejection is
high, but the penalty of a false alarm is not prohibitive.

The response bias for red is a different effect from the apparently
better recognizability of red stimuli. This follows because one would expect
a high degree of correlation between bias and accuracy if, say, the bias were
the source of the apparent accuracy benefit of red coloring. For letter
stimuli the correlation was 0.13, not significantly different from zero. For
word stimuli the correlation was 0.57, a value only on the verge of
significance (p = 0.07).




C. STATUS OF COLOR AS A SHAPE-INDEPENDENT STIMULUS FEATURE


1.  Introduction 

If color and shape are perceptually processed as independent dimensions
of a stimulus, then the benefit of color coding of shape should be clear-cut.
In particular, if in some inspection time t the probability of recognizing
only the stimulus shape is $P_s(t)$ and the probability of recognizing only
the stimulus color is $P_c(t)$ then, assuming processing independence, the
probability of recognizing the stimulus given the availability of both color
and shape is

$$P_{s,c}(t) = 1 - [1 - P_s(t)][1 - P_c(t)] \geq \max[P_s(t), P_c(t)] \qquad (1)$$

Color coding advantage can, of course, be demonstrated, as was shown in
Section II.B.3. However, it is not obvious that such an advantage is necessarily
a consequence of perceptual efficiency in the sense of Eq. (1), rather than,
say, a consequence of enhanced memory representation of the stimulus so that
the retrieval from memory necessary for response identification is more likely
to be correct. The distinction is one of generality. If observed color
advantage to stimulus identification is a consequence of Eq. (1), then
perceptual issues govern the advantage. This advantage could, therefore, be
assumed to be a consequence of a type of visual processing which is not likely
to be unique to a particular experimental setting. On the other hand, if
cognitive issues dominate, then a completely general experimental
demonstration of color advantage is not likely to occur. It is, therefore,
important to determine whether Eq. (1) can be shown to govern the
identification of color-coded shape.
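
As a minimal sketch of Eq. (1), the following Python function combines two recognition probabilities under the independence assumption. The function name and the sample probability pairs are hypothetical and serve only to show that the combined value is never below the larger of its two inputs.

```python
def combined_probability(p_shape, p_color):
    """Eq. (1): probability of recognition when shape and color are both
    available and are processed independently."""
    return 1.0 - (1.0 - p_shape) * (1.0 - p_color)


# Hypothetical (p_shape, p_color) pairs, not measured values.
for p_s, p_c in [(0.3, 0.0), (0.7, 0.6), (0.9, 0.2)]:
    p_both = combined_probability(p_s, p_c)
    print(f"P_s={p_s:.1f}  P_c={p_c:.1f}  ->  P_s,c={p_both:.2f}")
```

Equality with the larger input occurs only in the limiting cases where one attribute contributes nothing or the other is already recognized with certainty.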

2.  Experimental  Procedure 

We used an alphabet of four items in this experiment: a positively sloped
diagonal line and a negatively sloped diagonal line, colored either red or
green. The length of each line corresponded to 8 min of viewing angle. The
stimuli are summarized in Table 4.

TABLE 4. STIMULUS ALPHABET

                     Color
Slope           Red      Green

Positive:        1         2
Negative:        3         4

An experimental trial consisted of the following sequence of events: The
observer fixated a point on a monitor screen and pressed a button which caused
our computer to immediately select one of the four members of our stimulus
alphabet and randomly locate it in a centered 1.7° x 1.2° viewing region for a
controlled duration. At the offset of the stimulus, a 2.6° x 2.0° region,
encompassing the stimulus viewing region, was covered with a dense random mask
of the diagonal colored lines. This mask was constructed by random placement
of 240 stimuli of each of types 1, 2, 3, and 4 of the stimulus alphabet (960
lines in all). Different questions were asked for this type of presentation in
different phases of the experiment, depicted in Figs. 4-7.

In one phase (Fig. 4), the observer was asked to identify the shape of
the stimulus, disregarding color. This was a binary choice (positive or
negative slope) indicated by a button press to the computer. In another phase
(Fig. 5), the observer was asked to identify the stimulus color (red or green),
disregarding shape, by means of a similar button press. Data from these two
experiment phases were used to estimate the observer's psychometric curves for
color or shape identification as a function of the stimulus viewing time, for
the colored stimulus lines of this experiment.

The main point of this experiment was to determine how shape and color
information are combined for color-coded shapes. The observed psychometric
functions for pure shape identification and pure color identification were
used to predict recognition performance for color-coded shape under the
assumption that color and shape are independently processed. In the phase of
the experiment where shape was color coded, a subset of the stimulus alphabet
was used. For example, only stimuli 1 and 4 were presented in a phase (Fig. 6),
and the observer had to decide which one was presented on a trial. In this
case a positively sloped line was always red, and a negatively sloped line
was always green. The complementary experimental phase in which only stimuli
2 and 3 of the alphabet were presented was also run in each session (Fig. 7).

[Fig. 4 panels: a. Fixation symbol; b. Stimulus; c. Mask with choices;
d. Feedback]

Figure 4. Shape-recognition trial. Fixation symbol (a) is followed by
brief exposure of positively or negatively sloped diagonal
line in a random location in the field of view (b); this is
followed by a line mask and response choices (c). Accuracy
feedback (d) follows response. The stimulus line is equally
likely to be red or green and is equally likely to be positively
or negatively sloped.

Figure 5. Color-recognition trial. Similar to Fig. 4, except that the
subject must respond to the stimulus color.

[Fig. 6 panels: a. Fixation symbol; b. Stimulus; c. Mask with choices;
d. Feedback]

Figure 6. Color-coded shape trial: version A. Color and shape were
completely redundant, so that a line of positive slope (b) was
always red (PR), and a line of negative slope was always green
(NG). The subject indicated which stimulus was presented, by
giving the appropriate response (c).

[Fig. 7 panels: a. Fixation symbol; b. Stimulus; c. Mask with choices;
d. Feedback]

Figure 7. Color-coded shape trial: version B. Similar to Fig. 6,
except that the line of negative slope was always red (NR),
and the line of positive slope (b) always green (PG).

We estimate the desired psychometric functions by assuming that the
function is cumulative normal [10]. In this case, the function plotted on
probability paper is a straight line. Hence, the problem of estimating the
psychometric function reduces to fitting a straight line on probability paper;
two points suffice for this. The two points we chose to estimate were the
stimulus exposure durations necessary to achieve 70.7% and 84.1% accuracy
levels in the two-alternative-forced-choice situation of this experiment. The
70.7% and 84.1% points are the most tractable (see Section II.A.2). Estimation
trials for the 70.7% and 84.1% points were interleaved randomly.
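
The two-point fit described above can be sketched as follows. The sketch assumes that accuracy is a cumulative normal function of exposure duration on a linear time axis (the report does not state the axis scaling in this section), and the two exposure durations are hypothetical, not measured values. `statistics.NormalDist` from the Python standard library supplies the normal distribution and its inverse.

```python
from statistics import NormalDist


def fit_cumulative_normal(t1, p1, t2, p2):
    """Fit a cumulative normal through two (duration, accuracy) points.

    On probability paper the curve is the straight line z = (t - mu) / sigma,
    where z is the standard-normal deviate of the accuracy level.
    """
    z1 = NormalDist().inv_cdf(p1)
    z2 = NormalDist().inv_cdf(p2)
    sigma = (t2 - t1) / (z2 - z1)  # reciprocal of the probability-paper slope
    mu = t1 - sigma * z1           # duration at which the fitted line crosses 50%
    return mu, sigma


# Hypothetical exposure durations (ms) for the 70.7% and 84.1% points.
mu, sigma = fit_cumulative_normal(25.0, 0.707, 33.0, 0.841)
curve = NormalDist(mu, sigma)
print(f"mu = {mu:.1f} ms, sigma = {sigma:.1f} ms")
print(f"fitted accuracy at 30 ms: {curve.cdf(30.0):.3f}")
```

The fitted line passes exactly through both estimated points, so the quality of the fit rests entirely on how well those two durations were measured.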

After  a number  of  practice  runs  through  the  experiment,  the  observer  (J.J.M.) 
submitted  to  six  replications  of  the  experiment.  Each  data  point  used  in  the 
psychometric-function  estimate  was  based  on  more  than  700  stimulus  trials. 

3.  Results 

The  experimental  outcome,  presented  on  probability  paper,  is  shown  in 
Fig.  8.  The  curves  labeled  SHAPE  and  COLOR,  respectively,  are  the  estimated 
psychometric  functions  (straight  lines  on  probability  paper)  for  these  two 
attributes  in  the  experiment.  The  SHAPE  curve  represents  the  accuracy  level 
(%  correct)  as  a function  of  exposure  duration  for  the  pure-shape-recognition 
task.  The  COLOR  curve  represents  the  corresponding  information  for  the  pure- 
color-recognition  task. 

Applying Eq. (1) to the SHAPE and COLOR curves, we get the predicted
psychometric function for color-coded shape recognition. This predicted curve
is shown, as are the data points for the color-coded shape task. Since the
84.1% exposure time is significantly longer than that predicted from Eq. (1)
(Wilcoxon signed-rank test,* p < 0.05), the hypothesis that available color
and shape information are independently processed and used in the recognition
judgment is not supported. At the 70.7% accuracy exposure duration, only shape
information is used (data point falls on SHAPE curve); this is expected, since
color information is not available at this brief exposure. Around the 84.1%
accuracy exposure duration one cannot decide whether the subject uses only the
shape or only the color information of the stimulus, since the 84.1% points
for color, shape, and color-coded shape are not significantly different. What
is clear, however, is that the observer does not use both color and shape in
an average stimulus trial.
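
The prediction step can be sketched as below. Several things here are assumptions: the SHAPE and COLOR curve parameters are hypothetical, Eq. (1) is applied directly to the curve values as the text describes (whether a correction for guessing was applied first is not stated in this section), and the 84.1% exposure is located by simple bisection.

```python
from statistics import NormalDist


def predicted_combined(t_ms, shape, color):
    """Eq. (1) applied to two cumulative-normal curves given as (mu, sigma)."""
    p_s = NormalDist(*shape).cdf(t_ms)
    p_c = NormalDist(*color).cdf(t_ms)
    return 1.0 - (1.0 - p_s) * (1.0 - p_c)


def exposure_for(target, shape, color, lo=0.0, hi=1000.0, tol=1e-6):
    """Exposure (ms) at which the predicted curve reaches `target`.

    Bisection is valid because the predicted curve increases with exposure.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if predicted_combined(mid, shape, color) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical (mu, sigma) values in ms, for illustration only.
SHAPE = (20.0, 15.0)
COLOR = (30.0, 10.0)
t_841 = exposure_for(0.841, SHAPE, COLOR)
print(f"predicted 84.1% exposure: {t_841:.1f} ms")
```

Under independence the predicted 84.1% exposure is shorter than either single-attribute value, which is why an observed exposure longer than the prediction counts against independent use of both attributes.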

Tangentially,  it  is  worth  noting  that  Fig.  8 shows  different  decision 
mechanisms  to  be  involved  in  the  recognition  of  shape  and  color,  as  indicated 
by  the  significantly  different  slopes  of  the  color  and  shape  probability  plots 
of  the  psychometric  functions  (p  < 0.02). 

*Differences between estimated and observed exposure durations of less than
1 ms were considered not different in the analysis. A one-sided test was used
here, since one does not expect durations shorter than predicted by Eq. (1).

Figure 8. Results for color, shape, and color-coded shape recognition
for observer J.J.M. Independent processing of color and
shape in a color-coded shape-recognition task is not
indicated.

Garner [19] has introduced a distinction between what he calls integral
and separable stimulus dimensions. A multidimensional stimulus may have more
than one dimension as far as the experimenter is concerned, but may or may not
be dealt with as multidimensional by the observer. In Garner's terminology,
if stimulus dimensions are dealt with as integral they are naturally brought
together by the observer, so that the stimulus is not processed as an item with
many attributes, but as a unit which may be decomposed into its attributes. In
this case a performance benefit due to increased dimensionality is expected.
In terms of multidimensional scaling, a Euclidean metric should hold for
integral dimensions. If the stimulus dimensions are dealt with as separable,
then the stimulus is, in fact, processed as an item with distinct attributes.
In this case dimensional preferences are expected, making a benefit of
increased dimensionality less likely. A city-block metric should hold for
separable dimensions.
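
The two metrics named above differ only in how per-dimension differences are combined. The sketch below illustrates this for a pair of stimuli that differ on both dimensions; the coordinates are hypothetical scale values chosen for illustration, not values obtained by multidimensional scaling of the report's data.

```python
import math


def euclidean(a, b):
    """Euclidean metric: differences combine as the root of summed squares."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def city_block(a, b):
    """City-block metric: differences on each dimension simply add."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical (shape, color) coordinates, one unit apart on each dimension.
red_positive = (0.0, 0.0)
green_negative = (1.0, 1.0)

print(f"Euclidean:  {euclidean(red_positive, green_negative):.3f}")
print(f"City-block: {city_block(red_positive, green_negative):.3f}")
```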

Garner's concepts are well suited to the results described here and in
the preceding section (II.B). Apparently what we have demonstrated in the
current experiment, with color-coded shape recognition, is that the dimensions
of color and shape are separable. As they are not united by the observer,
dimensional preferences do exist. Consequently, a performance benefit due to
color coding is not found. The preceding experiment with colored letter and
word stimuli can, however, be interpreted as demonstrating that color and shape
are integral dimensions, since the stimulus coloring intrudes into
shape-recognition decisions. Hence, since the dimensions are integral, one
should expect a benefit of increased stimulus dimensionality (provided the
color is used in an informative sense).

When these two experiments are viewed together, the potential for color
benefit clearly appears to be an issue of the cognitive handling of the
available dimensions of color and shape. In the case of alphanumeric stimuli,
color and shape appear to be integral dimensions; this would indicate a benefit
of color for such display material. For simple shape recognition, color and
shape appear to be separable dimensions with a consequent lack of benefit from
color coding. This result strongly suggests that the likelihood of color
benefit for a particular type of display corresponds to the likelihood of the
color and shape information being dealt with as integral, rather than
separable, dimensions.


D.  INTRUSION  OF  COLOR  INTO  SHAPE-RECOGNITION  DECISION  PROCESSES  IN 
STEADILY  VIEWED  DISPLAYS 

1.  Introduction 

In the preceding sections we have explored the involvement of color in
pattern recognition with brief-exposure stimuli. Since the beginnings of modern
experimental psychology the use of brief exposures has been a popular device
for degrading the visual stimulus so that systematic errors in performance
might be uncovered, and rules of visual processing thereby revealed [20]. It
is in this spirit that we have used brief-exposure stimuli. However, such an
approach introduces a temporal issue which quite possibly does not fully
simulate the situation of an observer processing a steadily viewed display.
For example, a potential complication is that all Fourier components do not
reach visual decision centers at the same rate [21,22]. Hence the briefly
viewed stimulus might tend to emphasize certain aspects of the stimulus energy
spectrum which might be of far less importance under steady viewing.

Performance measures based on reaction time (RT) have become increasingly
popular. However, it has recently become apparent that, for various reasons,
generalizations based on RT measures are particularly prone to error. First,
there is the point about different Fourier components of the stimulus being
processed at different rates; this puts undue emphasis on the more rapidly
processed components in a speed-stress situation. Differential frequency
processing rates were recently demonstrated with an RT experiment [23]. A
better-known issue is the complication of the speed/accuracy trade-off, which
involves subtleties of the interaction between response speed and response
accuracy [24,25]. The use of RT measures, therefore, would not seem to offer
us any advantage for understanding issues related to pattern recognition in
steadily viewed displays.

Contrast-threshold techniques are appropriate for steadily viewed
displays. However, their use in the investigation of color involvement in
pattern recognition is problematic because the contrast threshold is more
properly concerned with the stimulus strength required for visibility, rather
than recognizability. In studying pattern recognition we are concerned with
the high-level cognitive processing of visible information, as opposed to the
sensory processing of information required to render the stimulus visible. The
significance of contrast-threshold measures for display descriptors is
discussed in Section V of this report.

The fact is that the study of visual information processing for
pattern recognition in steadily viewed displays has traditionally involved
indirect physiological or physical measurements, such as eye movement and EEG
recording, whose interpretation must often be dealt with at a qualitative
level. There does not appear to be a useful paradigm for the systematic
psychophysical investigation of the problem. We have, therefore, devised
one. Our approach, which has its origins in the work of A. Rose [3], involves
the detection of structure in a random dot pattern. An example is given in
Fig. 9.

Figure 9a contains a random dot pattern; Fig. 9b, a line drawing of a
triangle. Fig. 9c was produced by, in effect, placing the triangle of Fig. 9b
over the dot pattern of Fig. 9a so that dots are obscured at the location of
the triangle. The triangle portrayed in Fig. 9c can be seen by viewing this
figure from a distance of a few feet (a poor reproduction of the figure may
make the recognition of the triangle difficult). One can reproduce Fig. 9b
from Fig. 9c by increasing the dot density in Fig. 9c so that individual dots
are no longer resolved. Conversely, one can go from Fig. 9b to Fig. 9c by a
decrease in dot density; one can also decrease the dot density further so as
to render the triangle imperceptible. By varying the dot density, we
effectively vary the signal-to-noise ratio of the bilevel display of the
triangle. A measure of the observer's ability to recognize the triangle in the
dot field can therefore be obtained in terms of the dot density required.
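
The construction of Fig. 9c can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the report's display software: dots are scattered uniformly in a unit square, the triangle is treated as an outline of finite line width, and any dot falling on the outline is dropped. The function names, triangle vertices, line width, and dot count are all hypothetical.

```python
import math
import random


def dist_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def dot_field(n_dots, triangle, line_width, rng):
    """Scatter n_dots at random, then remove those obscured by the outline."""
    edges = list(zip(triangle, triangle[1:] + triangle[:1]))
    dots = [(rng.random(), rng.random()) for _ in range(n_dots)]
    return [p for p in dots
            if all(dist_to_segment(p, a, b) > line_width / 2 for a, b in edges)]


# Hypothetical base-down triangle and dot count.
triangle = [(0.3, 0.3), (0.7, 0.3), (0.5, 0.65)]
visible = dot_field(2000, triangle, line_width=0.04, rng=random.Random(1))
print(f"{len(visible)} of 2000 dots remain visible")
```

Raising the dot count makes the vacant outline stand out more clearly against the surrounding field, while lowering it lets chance gaps in the noise mask the outline.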

Clearly, the perception of the triangle in the dot pattern of Fig. 9c
involves high-level cognitive judgments as to which dot arrangements are
sufficiently haphazard to be considered noise and which ones appear
sufficiently regular to be considered signal. These are active judgments
concerned with pattern recognition, as opposed to pattern visibility. We are
addressing issues of pattern recognition with this type of pattern, and we are
addressing them by means of a steadily viewed display.

We have, therefore, a type of visual stimulus which, when steadily viewed,
can either reveal a pattern or not, depending on the value of the dot density.
The task for the observer is one of pure shape recognition. The question we
will pose is, will color influence this ability?

Figure 9. Portrayal of shape in a random dot pattern.

All of the information appropriate for perception of the pattern portrayed
in the dot field is contained in the presence or absence of dots, i.e., in the
luminance profile of the dot field. As long as the dots are visible, their
color is irrelevant. More to the point, if we color the dots without modifying
the luminance profile of the display we do not alter the signal-to-noise ratio
of the portrayed pattern. Hence, if we randomly color the dots of the type of
display given in Fig. 9c we introduce random color noise without modifying the
luminance signal-to-noise ratio of the pattern. By introducing this type of
color noise we neither modify the observer's task, nor do we detract from the
information available for shape recognition in the dot field. What we do is
to open up a new dimension of the display, a dimension of color variation. The
observer will operate efficiently if he disregards this dimension. It is not
certain that he can. The degree to which noise in the color dimension intrudes
into the shape-recognition processing that operates on the dot field's luminance
profile gives us a measure of the impact of color information on shape
recognition in steadily viewed displays.
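
The idea of adding color noise without touching the luminance profile can be sketched as a labeling step that is independent of dot position. The function name and mode strings are hypothetical; the equal split of red and green dots in the mixture follows the description of the mixed field in the experimental procedure.

```python
import random


def color_dots(n_dots, mode, rng):
    """Return one color label per dot; dot positions are never consulted,
    so the luminance profile (where dots are present or absent) is unchanged."""
    if mode == "red":
        return ["red"] * n_dots
    if mode == "green":
        return ["green"] * n_dots
    if mode == "mixture":
        # Equal numbers of red and green dots, assigned in random order.
        colors = ["red"] * (n_dots // 2) + ["green"] * (n_dots - n_dots // 2)
        rng.shuffle(colors)
        return colors
    raise ValueError(f"unknown coloring mode: {mode}")


colors = color_dots(1000, "mixture", random.Random(0))
print(colors.count("red"), "red dots,", colors.count("green"), "green dots")
```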

2.  Experimental  Procedure 

A typical stimulus trial is shown in Figs. 10 and 11. A "ready fixation"
symbol, in yellow, was first displayed on the monitor (Figs. 10a and 11a).
When the subject was ready to initiate a trial, a yellow, base-down triangle
was presented in a random screen location for 500 ms (Figs. 10b and 11b). This
triangle was presented to serve as a visual template for the immediately
following steady display of a random dot field (Figs. 10c and 11c). The
subject's task was to inspect the random dot field for as long as desired to
decide whether or not the just-seen triangular shape was portrayed in the dot
field. The triangle was equally likely to be present or absent; if it was
portrayed in the dot field, its location was random. After the subject pressed
a button to indicate presence or absence of the triangle, a yellow
accuracy-feedback message was presented on the screen (Figs. 10d and 11d). The
reason for randomizing the location of the template triangle and the subsequent
dot-field portrayal of the triangle was to avoid the likelihood of enhanced
appearance of the dot-field triangle because of location congruence with the
template, which might allow the observer to bypass the dot-field analysis.

[Fig. 10 panels: a. Ready (fixation) sign; b. Stimulus template; c. Dot field
inspected for presence of triangle; d. Feedback]

Figure 10. Small-triangle detection trial. The fixation symbol (a) is
followed by a quick view of the triangle shape to be detected
(b). The observer inspects the dot field (c) and then gives
a button response to indicate whether or not he thinks the
triangle is present. Accuracy feedback (d) follows the
response.

[Fig. 11 panels: a. Ready (fixation) sign; b. Stimulus template; c. Dot field
inspected for presence of triangle; d. Feedback]

Figure 11. Large-triangle detection trial. Similar to Fig. 10,
except that the triangle size is doubled.

Two triangle sizes were used; a smaller triangle was about 1.2° on a side
(Fig. 10b), and a larger triangle twice this size (Fig. 11b). Three versions
of dot field coloring were used; the field could contain only red dots, only
green dots, or a red-green mixture composed of an equal number of red and green
dots. The dot luminance (10 mL) did not depend on coloring. The two triangle
sizes were used in separate back-to-back sessions for each observer. The
ordering of which triangle size was used first was balanced across the
observers. For each triangle-size session, the threshold number of dots
required for triangle detection in the various colored dot fields was
determined (71% accuracy level with 500-dot step size, determined by our
standard sequential estimation procedure). The threshold determination trials
for the red, green, and red-green-mixture dot fields were randomly interleaved
in the sessions.
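
The report's sequential estimation procedure is defined in Section II.A.2 and is not reproduced here. As a purely illustrative stand-in, the sketch below uses a two-down/one-up staircase, a standard rule that converges on the level where the probability of two successive correct responses is 0.5, i.e. about 70.7% correct. The simulated observer, the starting level, and the trial count are hypothetical; only the 500-dot step size comes from the text.

```python
import math
import random


def simulated_observer(n_dots):
    """Hypothetical accuracy function rising from 0.5 (chance) toward 1.0."""
    return 1.0 - 0.5 * math.exp(-n_dots / 2000.0)


def two_down_one_up(p_correct, start, step=500, n_trials=200, rng=None):
    """Return the dot number presented on each trial of the staircase."""
    rng = rng or random.Random()
    level, run, track = start, 0, []
    for _ in range(n_trials):
        track.append(level)
        if rng.random() < p_correct(level):
            run += 1
            if run == 2:                 # two correct in a row: make it harder
                level = max(step, level - step)
                run = 0
        else:                            # any error: make it easier
            level += step
            run = 0
    return track


track = two_down_one_up(simulated_observer, start=5000, rng=random.Random(0))
late = track[len(track) // 2:]
print(f"mean dot number over the last half of the run: {sum(late) / len(late):.0f}")
```

Fewer dots make the task harder here because a sparser field gives a lower signal-to-noise ratio for the portrayed triangle.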

The  experiment  consisted  of  the  two  triangle-size  sessions,  each  preceded 
by  a short  set  of  practice  trials.  This  experiment  took  about  an  hour.  Seven 
observers  participated. 

3.  Results 

The threshold number of dots required for triangle detection (average
across observers) for the two triangle sizes as a function of dot coloring
is given in Table 5.

TABLE 5. THRESHOLD DOT NUMBER FOR VARIOUS TRIANGLE
SIZES AND DOT COLORINGS

                          Triangle Size
Dot Color                SMALL      LARGE

Red:                     2678       1857
Green:                   2571       1809
Red-Green Mixture:       2643       2274

For either size of target pattern, the difference between the thresholds
for all-red and all-green dot fields was not significant. The use of a
monochrome dot field does not introduce a dimension of color variation, hence
no difference is to be expected regardless of whether all-red or all-green dots
are used.

For the red or green monochrome dot fields, the threshold dot density for
pattern detection is expected to increase as triangle size decreases. However,
the observed increase in density threshold is somewhat less than anticipated if
the decision processes are the same for detecting both the large and small
triangles. In inspecting the dot pattern for the larger triangle, for example,
one would assume the observer to be looking for some critical, triangle-defining
dot-vacant region. Suppose such a potential signal region has area A. In the
noise (signal-free) portions of the dot field a region of area A has, on the
average, n dots with standard error (rms fluctuation) $\sqrt{n}$. Hence, a
region of area A in the noise can contain $n \pm \alpha\sqrt{n}$ dots, with the
probability of a particular dot number decreasing as $\alpha$ increases. The
reader may recognize the expression $n \pm \alpha\sqrt{n}$ as the confidence
interval about the mean value n at a probability level determined by $\alpha$.
We may assume that, in trying to decide whether what looks like a dot-free
region is not a fluctuation in the noise, the observer adopts a decision
criterion, which amounts to specifying some value of $\alpha$ in the confidence
interval. For a dot-vacant signal region to be distinguished from a noise
fluctuation, the dot density must be sufficiently high so that
$n - \alpha\sqrt{n} > 0$ for the observer-specified value of $\alpha$. At
threshold for signal detection, $n - \alpha\sqrt{n} = 0$, or
$\sqrt{n} = \alpha$ = constant.

Suppose, for detection of the larger triangle, the noise dot density is
set at the threshold dot density of $n_t$ dots per region of area A, and the
smaller triangle is the signal pattern portrayed in the dot field. For the
small pattern, the triangle-defining regions of interest will have area A/2,
which contain, on the average, $n_t/2$ dots in the noise. Then the triangle will
be below the detection threshold. For the smaller triangle to become apparent,
the number of dots in the noise field should be double the number necessary
to detect the larger triangle.*
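
The scaling argument above can be illustrated with a minimal sketch. It assumes,
as the decision model does, that the dot count in a region has rms fluctuation
equal to the square root of its mean, so that the threshold mean count is
alpha squared. The function names, the criterion value alpha = 3, and the area
values are hypothetical and chosen only for illustration.

```python
def threshold_mean_count(alpha: float) -> float:
    """Mean dots per inspected region at threshold.

    Solves n - alpha * sqrt(n) = 0 for n > 0, which gives n = alpha**2.
    """
    return alpha ** 2


def field_dots_at_threshold(alpha: float, region_area: float, field_area: float) -> float:
    """Total dots in the field so that a region of region_area holds alpha**2 dots on average."""
    density = threshold_mean_count(alpha) / region_area
    return density * field_area


ALPHA = 3.0          # hypothetical decision criterion
FIELD_AREA = 1000.0  # hypothetical field area (arbitrary units)
LARGE_REGION = 10.0  # hypothetical inspection area A for the larger triangle
SMALL_REGION = LARGE_REGION / 2  # area A/2 for the smaller triangle

dots_large = field_dots_at_threshold(ALPHA, LARGE_REGION, FIELD_AREA)
dots_small = field_dots_at_threshold(ALPHA, SMALL_REGION, FIELD_AREA)
predicted_ratio = dots_small / dots_large
print(predicted_ratio)  # 2.0
```

The predicted ratio does not depend on the value chosen for alpha, which is why
the model predicts a doubling of the threshold dot number regardless of the
observer's criterion.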

However, the data indicate that, in going from the larger to the smaller
triangle, the threshold dot number increases only by a factor of about
$\sqrt{2}$, rather than a factor of 2. Hence, if the decision processes are the
same for perceiving the triangles of either size, the smaller triangle is
easier to detect than would be expected (or the larger triangle is more
difficult to detect). One may conclude that the decision-making process is
somewhat different when one is looking for the smaller triangle than when one
is looking for the larger one.

*This assumes that there is no inherent limitation of visual processing that
prevents the inspection area, A, from being as large as the larger triangle.

The main object of this experiment was to determine if pure color noise
intrudes into a shape-recognition task that depends solely on the processing
of the luminance profile of the pattern. Such an intrusion effect is
demonstrated for the larger triangle. The number of dots required for detection
of the larger triangle in the monochrome dot field is 1833 (average of red
and green thresholds), whereas for a red-green-mixture dot field 2274 dots
are required. This increase in threshold dot number is significant (Wilcoxon
signed-rank test, p < 0.05). From this it follows that the dimension of color
variation may intrude into shape-recognition judgments. This outcome is
consistent with results presented in Section II.B (above) that pertained to
brief exposures for word and letter recognition. Consequently, if one makes
available a dimension of color variation in a display, it potentially involves
itself automatically in recognition judgments in the sense that no attention to
its presence is required; indeed, it is not readily disregarded.

That it is the intrusion of the dimension of color variation that is
involved, and not simply a tendency to attend to only one color at a time, is
seen from the relationship of the dot-number thresholds in the monochrome and
multicolored situations. Given that 1833 dots are required for detection of
the larger triangle in a monochrome dot field, one would expect an observer
who simply attends to one color (red or green) in picking out the large
triangle in the multicolored dot field to require about 3666 dots for
detection, i.e., 1833 dots in each color. Since only 2274 dots are required, a
strict color-attention hypothesis is rejected.
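
The arithmetic behind this rejection can be written out as a short sketch. The
two thresholds are the values reported above; the variable names are
illustrative, and the doubling assumes that red and green dots are equally
represented in the mixture field.

```python
MONOCHROME_THRESHOLD = 1833  # larger triangle, average of red and green fields
MIXTURE_THRESHOLD = 2274     # larger triangle, red-green-mixture field

# Strict color attention: only the dots of one color are used, so the field
# must contain the monochrome threshold number in each of the two colors.
strict_attention_prediction = 2 * MONOCHROME_THRESHOLD

# Observed increase caused by adding the color dimension.
intrusion_ratio = MIXTURE_THRESHOLD / MONOCHROME_THRESHOLD

print(strict_attention_prediction)  # 3666
print(round(intrusion_ratio, 2))    # 1.24
```

The observed threshold lies well below the strict-attention prediction but
above the monochrome value, which is the pattern expected for intrusion of the
color dimension rather than attention to a single color.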

While a color intrusion influence is found for detection of the larger
triangle, it is absent for the smaller triangle. About 2625 dots were required
for perception of the smaller triangle in the monochrome dot fields (average
of red- and green-field thresholds) and about 2643 dots were required for
small-triangle detection in the multicolored dot field. There is no significant
difference between the dot number thresholds for the monochrome and multicolored
dot fields insofar as detection of the smaller triangle is concerned. All
subjects, when asked after completion of the experiment if they were aware of
the multicolored nature of the red-green-mixture dot fields, reported that
the multicoloring was quite apparent. Yet, the visible color variation
dimension did not intrude into shape-recognition judgments for the smaller
triangle.
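
As a rough check on the numbers quoted in this section, the following sketch
compares the threshold ratios directly. The thresholds are those reported
above; the variable names are illustrative only.

```python
import math

LARGE_MONO = 1833   # larger triangle, monochrome fields
SMALL_MONO = 2625   # smaller triangle, monochrome fields
SMALL_MIXED = 2643  # smaller triangle, red-green-mixture field

# Effect of triangle size in monochrome fields (model prediction: factor of 2).
size_ratio = SMALL_MONO / LARGE_MONO

# Effect of adding color variation for the smaller triangle, in percent.
color_increase_percent = 100 * (SMALL_MIXED / SMALL_MONO - 1)

print(round(size_ratio, 2), round(math.sqrt(2), 2))  # 1.43 1.41
print(round(color_increase_percent, 1))              # 0.7
```

The size ratio is close to the square root of 2 rather than 2, and the change
due to color for the smaller triangle is under one percent.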

As was pointed out in the discussion of triangle detection results for
monochrome dot fields, there is reason to suspect that visual processing for
perception of the smaller triangle is somewhat different from that for the
larger triangle. This is suggested by the relatively enhanced detectability
of the smaller triangle as compared to what is expected from application of
signal-detection-theory considerations. It appears likely that the relatively
enhanced detectability of the smaller triangle in monochrome dot fields is
related to the lack of color intrusion for the shape detection in the
multicolored dot field. In particular, a likely source of these effects is that
in processing for detection of the smaller triangle the dot field is globally
processed as a unitary texture, in contrast to a local analysis for dot-field
regularities which is presumably carried out for detection of the larger
triangle.

That unitary texture processing can have dramatic influence is demonstrated
by an effect we have produced using contrast reversal on a
triangle-portraying dot field. In Fig. 12 we show the same dot field portraying
the triangle, one field with white dots on black background and the other field
with black dots on white background. It takes some adjustment of viewing
distance, and some inspection time, to perceive the triangle in either dot
pattern, but in any case the portrayed shape is decidedly more salient in the
white-dots-on-black-background dot field. The explanation of this effect would
take us too far afield from the current discussion [26]. The point we want to
make is that this effect is a consequence of global texture processing and
serves to illustrate that the consequence of such processing is not necessarily
subtle.

If, in processing the multicolored dot field for detection of the smaller
triangle, the dots are dealt with as a unitary texture, it is necessary for the
observer to cognitively reject color differences that are visible so as to
perceptually tie together all dots as members of the dot texture. If this is
accomplished, then we would expect color variations in the dot field to be of
no consequence. That is, detection of the triangular shape should be no
more difficult in the multicolored dot pattern than in the monochrome dot
pattern. Furthermore, we might expect this global, interactive type of
processing, which emphasizes cooperative influence among picture elements
(dots), to yield better detectability than would be accomplished by a linear
detector process of the type outlined in the statistical decision model
described earlier. Why this global texture processing, with its probable effect
of preventing color intrusion and its probable enhancement of shape detection,
should be evident for the small-triangle detection task and not for the larger
triangle is a fundamental question that must be considered a topic for separate
investigation.

However, we can say with some confidence that there are available to the
observer local and global modes of visual processing for shape recognition.
For local modes of processing, straightforward statistical decision-theory
models of the shape-detection process should be appropriate. Of particular
significance in local processing is the fact that intrusion of information
available in a color dimension is expected for decision making about shape.
Whether this yields a benefit or a deficit in performance presumably depends
on whether the color information is relevant (as a redundant code) or
irrelevant (as distracting noise). For global modes of processing, it appears
that one can expect a somewhat enhanced recognition performance as compared to
the local mode, but then the availability of a color dimension would seem to
serve no purpose.

E.  CONCLUSIONS 

Our approach to determining whether a performance benefit, in terms of
stimulus recognition, results from the color coding of shape, has been to look
for some guiding principles, rather than find specific answers to specific
problems. The principles that appear to follow from our work are these: When
color and shape are treated as integral dimensions, or when the stimulus is
locally processed, a benefit from stimulus color coding can be expected. When
color and shape are treated as separable dimensions, or when the stimulus is
globally processed, the availability of color coding may be irrelevant.

Hence, a universal answer to the question concerning the utility of color
for pattern recognition is not expected. However, if one can determine, for
a particular display-observer situation, the likely information processing
mode, in terms of integrality vs separability of color and shape, or in terms
of local vs global stimulus processing, a statement about the likelihood of
color benefit can be made.

While our experiments do not allow us to characterize generally when
integral dimension handling or when local processing will occur, we can make
some reasonable inferences. We found, for single shape discrimination (Section
II.C), that color did not enter into the discrimination judgments, whereas for
word and letter recognition (Section II.B) color could not be disregarded.
This suggests that the more complex the information to be processed, the more
likely it is that color and shape will be handled as integral dimensions.

Using the Advanced Integrated Display System (AIDS) as an example, one
might expect that color would be relatively unimportant for the right-hand
engine-management screen, since it consists of relatively simple shapes
arranged in a manner that places a minimal processing burden on the pilot. On
the other hand, more complex displays, such as the lower moving-map or the
upper, vertical-situation display, involve more than cursory simple-shape
processing and might, therefore, benefit more from color coding.

Roughly speaking, one can distinguish local and global processing based
on whether the observer inspects display details (local processing) or tends
to take in the display as a whole (global processing). The engine-management
display of the AIDS provides an example. The intent of this display is to allow
the pilot to avoid inspecting display details and instead glance at the display
for malfunction indication, such as misaligned arrows. One may say that the
display is meant to be globally, rather than locally, processed. In this case
one would, therefore, anticipate a negligible benefit from color coding. On
the other hand, the vertical-situation display is used for local processing,
as when the signal caret indicates an item of interest. Here one would expect
a benefit of color.


III.  STATISTICAL  PROPERTIES  OF  THE  LUMINANCE  AND  CHROMINANCE 
VARIATIONS  IN  NATURAL  SCENES 


A.  INTRODUCTION 

Our approach in describing the performance of displays has been to treat
the display-observer system as elements of a noisy communication channel.
Generally, in order to maximize the amount of received information* and minimize
the display cost, complexity, size, and other factors, an attempt is made to
match the performance of the display-observer system to the properties of the
information to be transmitted. However, in most practical systems a unique
characterization of the information is not possible because a fraction of the
information transmitted is constantly changing in an unpredictable manner, and
because other random processes, such as noise, are inevitably present. Thus,
it is possible only to describe the signal characteristics as random processes
whose statistically averaged properties are known, or can be estimated. These
average properties can then be used to describe the most likely system response
for a given set of display-observer parameters.

In the development of the display descriptors presented in TR1, TR2, and
Section V of this report we have assumed, based on preliminary experimental
evidence presented in TR1, that statistical estimates for images can be
properly incorporated into a mathematical framework to predict accurately the
performance of the display-observer system. In Section IV further experimental
results that support this assumption will be presented.

Our mathematical display descriptors require that the ensemble-averaged
power spectral densities be known for both the luminance and chrominance
variations in natural scenes.** It is also required that the absolute
magnitudes of these power spectral densities be known. For luminance
information, the scale of modulation depth was determined by measuring the
ensemble-averaged ratio of the rms luminance variations to the average
luminance for a large number of individual scenes. For chrominance information
the ensemble-averaged rms value for chrominance variations was measured in a
calibrated system as the fraction of color saturation. These quantities are
discussed in more detail in Section V.

*The form of the "information" must be defined according to the specific
requirements of the channel. In both TR1 and TR2 this issue is discussed
in detail with respect to displayed information.

**By "natural scenes" we mean those pictorial scenes that are of general
interest to nonspecialized observers. This definition loosely defines one of
many possible subsets of pictorial information. Other subsets, for example,
are alphanumerics, line drawings, the art work of Jackson Pollock, and x-ray
negatives. Each of these subsets of information has unique statistical
properties that can be incorporated into our display formalisms.

In a strict sense it is possible to determine meaningful average
properties for random processes only when the signals are stationary [27].
Although the signal sources discussed in this report are certainly not
stationary, it will be shown that there is sufficient self-similarity among the
scenes studied to warrant drawing strong general conclusions about their
properties.

B.  MEASUREMENT  TECHNIQUES 

Our general approach was to utilize as the signal sources the
electronically generated luminance and chrominance signals both from off-the-air
commercial television and from a three-vidicon television camera viewing 35-mm
color slides. Television represents an appropriate means of producing these
signals because, as part of the encoding process for transmission, it separates
the pictorial information into three parts: a luminance signal and two
roughly orthogonal, constant-luminance chrominance signals. It will be shown
in Section IV that signals of this form are required for a complete description
of the information-handling capabilities of the human visual system.
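
The three-part separation described above can be illustrated with Python's
standard `colorsys` module, which implements a simplified linear RGB-to-YIQ
conversion. This is only a sketch: it ignores gamma, bandwidth limiting, and
the calibration of the apparatus used in this study, and the helper name
`luminance_and_chrominance` and the sample colors are hypothetical.

```python
import colorsys


def luminance_and_chrominance(r: float, g: float, b: float):
    """Split an RGB triple (components in 0..1) into a luminance value and an (I, Q) pair."""
    y, i, q = colorsys.rgb_to_yiq(r, g, b)
    return y, (i, q)


# White carries luminance only: both chrominance components vanish,
# which corresponds to the white point where the I and Q axes cross.
white_y, (white_i, white_q) = luminance_and_chrominance(1.0, 1.0, 1.0)

# An orange color lies mostly along the positive I axis.
orange_y, (orange_i, orange_q) = luminance_and_chrominance(1.0, 0.5, 0.0)

print(round(white_y, 3), round(abs(white_i), 3), round(abs(white_q), 3))
print(orange_i > abs(orange_q))  # True
```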

The chromaticity signals we examined are shown in Fig. 13 as lines Ic and
Qc on a CIE chromaticity diagram. The solid straight lines show the chromaticity
paths for systems with unity gamma. The curved dotted lines show the
chromaticity paths for transmitted signals with a gamma of 1/2.2. In both cases
the Ic and Qc axes pass through the white point on the chromaticity diagram.
Also shown on Fig. 13 are the NTSC red, green, and blue primaries. The
saturation values given in this report are defined as percentages of these
coordinates.

The chrominance signals were obtained by demodulating the appropriate
video signals with a commercial vectorscope. This device has the advantage of
allowing any chrominance axes through the white point to be easily selected.

Figure 13. CIE chromaticity diagram showing the locations of the
chromaticity axes Ic and Qc. The lines through the white point
represent chrominance paths for signals with unity gamma; the
dotted lines through the white point indicate the chrominance
paths for a gamma of 1/2.2. Also shown in this figure are
the NTSC color primaries.

The Ic and Qc axes were chosen for two reasons. First, these are the axes
used in commercial television for transmitting the chrominance signals.*
Second, during preliminary experiments it was found that the power
distribution about the white point of Fig. 13 was highly anisotropic. The
measured power distribution was roughly that of an ellipse with the major axis
along Ic and the minor axis along Qc. Thus, it was necessary to measure only the
statistical properties along these two directions, since statistical estimates
for other directions can be made from these values. The bandwidths for both
the Ic and Qc axes were approximately 0.5 MHz (-3 dB); the bandwidth for the
luminance signals was 3 MHz (-3 dB).

*In the television literature they are known simply as the I and Q
channels [28].

For both the luminance and chrominance signals, all aspects of the
waveforms that were not part of the actual scenes, such as color burst, sync,
and blanking, were removed. Further, the total luminance, $I_T$, and
chrominance, $C_T$, signals were decomposed into their respective average
values, $\bar{I}$ and $\bar{C}$, and modulation terms, $I_m$ and $C_m$.* This
operation is shown schematically in Fig. 14.

*$C_{T,Q} = \bar{C}_Q + C_{m,Q}$ or $C_{T,I} = \bar{C}_I + C_{m,I}$, depending
on whether we are referring to the Qc or Ic axes, respectively.

Figure 14. The luminance and chrominance signals were decomposed into
an average term plus a modulation term. That is,
$I_T = \bar{I} + I_m$, $C_{T,I} = \bar{C}_I + C_{m,I}$, and
$C_{T,Q} = \bar{C}_Q + C_{m,Q}$, where $\bar{I}$ is the average
luminance for the scene and $\bar{C}_I$ and $\bar{C}_Q$ are the
average percent saturations for the scene along the Ic and
Qc axes, respectively. The properties of the average terms
and the modulation terms were measured separately, as shown.

All of the power spectral densities in this report were obtained from
the modulation terms $I_m$, $C_{m,Q}$, and $C_{m,I}$. This approach has the
important advantage of minimizing the effects of the limited size of the
scanned "window" of a scene on the measured power spectral densities. This is
because the power spectral densities for the $\bar{I}$, $\bar{C}_Q$, and
$\bar{C}_I$ terms are of the form $\mathrm{sinc}^2(\omega w/2)$ (where $w$ is
the picture width), which rolls off at high frequencies as $1/\omega^2$
(an example of this spectrum is shown in Fig. 19). Since for scenes with
relatively small modulation depths the power spectra from the average terms
can mask details in the modulation power spectra, it is important that the
average term be removed before the power spectra are obtained.
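
The form of the average-term spectrum quoted above can be checked with a short
derivation. This is a sketch under a simplifying assumption that is not spelled
out in the text: the average term is modeled as a constant level $\bar{I}$
across a window of width $w$ and zero outside it, with
$\mathrm{sinc}(x) = \sin(x)/x$.

```latex
\begin{align}
F(\omega) &= \int_{-w/2}^{w/2} \bar{I}\, e^{-i\omega x}\, dx
           = \bar{I}\, \frac{2 \sin(\omega w/2)}{\omega}
           = \bar{I}\, w\, \mathrm{sinc}(\omega w/2), \\
|F(\omega)|^2 &= \bar{I}^2 w^2\, \mathrm{sinc}^2(\omega w/2)
           = \frac{4 \bar{I}^2 \sin^2(\omega w/2)}{\omega^2}
           \le \frac{4 \bar{I}^2}{\omega^2}.
\end{align}
```

The envelope of the window spectrum therefore falls as $1/\omega^2$, the same
rate as the modulation spectra, which is why a large average term can mask the
modulation spectrum unless it is removed first.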

For both the luminance and chrominance signals the video bandwidths and
signal-to-noise ratios were sufficiently high for their effects on the measured
statistical signal properties to be negligible. The input signal amplitudes
were normalized by maintaining a constant-peak-white-signal amplitude and a
constant-color-reference amplitude. The system gamma was approximately 1/2.2
for the commercially transmitted signals and approximately 1.0 for the 35-mm
slides used in this study. The actual gamma of the television camera used to
view the 35-mm slides was 0.6, but because the gammas for the color slides
varied between roughly 1.4 and 2.2, the effective gamma of the overall system
was close to 1.0. The effect of the different gammas on the chromaticity
results can be seen in Fig. 13.* Because the average saturations of the images
studied were relatively small, and because the perceptual space approximated
by the CIE diagram is highly nonlinear, we conclude that the effects of the
different gammas on the measured chromaticity results are insignificant.

The measurements were performed with the apparatus shown schematically
in Fig. 14. For each image studied the amplitude of the average term (either
$\bar{I}$, $\bar{C}_Q$, or $\bar{C}_I$) was determined by adjusting its value
until the reading on the rms voltmeter was a minimum. The average rms value
(that of $I_m$, $C_{m,Q}$, or $C_{m,I}$) was then read from the appropriately
calibrated dial.

The power spectra were obtained with an ac-coupled video spectrum analyzer.
The spectra produced by a television video signal consist primarily of terms
at discrete frequencies that are integer multiples of the horizontal line
frequency (approximately 15.8 kHz). Thus, the power spectra presented are
limited at the low end to a frequency proportional to the inverse of the picture
width, and at the high end by the bandwidth and signal-to-noise ratio of the
system. Additional details about the properties of video spectra can be found
in Ref. [29].
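
To give a feel for how many discrete spectral terms fall within the measured
bands, the following sketch counts the line-frequency harmonics below each
bandwidth. It uses the approximate line frequency quoted above and the -3 dB
bandwidths given in the description of the signals; treating the -3 dB point as
a hard cutoff is a simplifying assumption, and the function name is
hypothetical.

```python
LINE_FREQUENCY_HZ = 15.8e3         # approximate horizontal line frequency
LUMINANCE_BANDWIDTH_HZ = 3.0e6     # luminance signal bandwidth (-3 dB)
CHROMINANCE_BANDWIDTH_HZ = 0.5e6   # Ic and Qc signal bandwidth (-3 dB)


def harmonic_count(bandwidth_hz: float, line_frequency_hz: float = LINE_FREQUENCY_HZ) -> int:
    """Number of integer multiples of the line frequency at or below the bandwidth."""
    return int(bandwidth_hz // line_frequency_hz)


luminance_harmonics = harmonic_count(LUMINANCE_BANDWIDTH_HZ)
chrominance_harmonics = harmonic_count(CHROMINANCE_BANDWIDTH_HZ)
print(luminance_harmonics, chrominance_harmonics)  # 189 31
```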

The 35-mm color slides used in this study were selected to represent a
diverse range of subject material. We assume that the quality of the slides
was sufficiently high for the imperfections in the slides not to affect our
results significantly, although it is apparent that the saturation and
contrast of actual images would exceed those of our slides.

For convenience, the 35-mm slides were organized into four categories:
single-object scenes, two-object scenes, three-to-ten-object scenes, and crowd
scenes. These classifications represent, roughly, the number of prominent
objects in the scenes. It should be emphasized that these classifications are
only approximate and that, especially for the chrominance results, the power
spectra should be inspected individually.

*All the luminance measurements were performed with 35-mm slides.

C.  RESULTS 

1. Average rms Modulation Depth for Luminance Information

The results of our measurements on the normalized rms modulation depth
for luminance variations in natural images are shown in Table 6. Also shown,
as entry No. 34, is the result for a page of newsprint. The measured results
have all been normalized by dividing the rms luminance values for each scene
by their respective average luminance values.

As indicated in the table, the normalized rms modulation depths for the
individual slides studied ranged between 0.28 and 0.87, with an average value
of 0.56 (excluding slide No. 34). The average value indicates that natural
scenes are highly modulated. For comparison consider a scene composed of
alternate black-and-white strips of equal width. The fractional rms
modulation depth for this scene is 1.0.
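
The strip example can be verified numerically. The sketch below computes the
normalized rms modulation depth as defined above (rms deviation from the mean
divided by the mean) for an idealized strip pattern; the function name and the
sample values (black = 0, white = 1) are illustrative assumptions.

```python
import math


def normalized_rms_modulation_depth(luminances):
    """Rms deviation of the luminance samples from their mean, divided by the mean."""
    mean = sum(luminances) / len(luminances)
    variance = sum((value - mean) ** 2 for value in luminances) / len(luminances)
    return math.sqrt(variance) / mean


# Alternating black (0.0) and white (1.0) strips of equal width.
strips = [0.0, 1.0] * 50
depth = normalized_rms_modulation_depth(strips)
print(depth)  # 1.0
```

With this definition the mean is half the white level and every sample deviates
from it by the same amount, so the ratio is exactly 1.0.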

In observing the video waveforms of these and many other scenes it was
noted that the actual distribution of luminance levels is not uniform from
black to maximum white. In general, the highlight portions of the scenes
occupied a smaller fraction of the total area of the scenes than did the
lowlights. By inspection it was estimated that the highlight amplitudes were
roughly three to four times the average luminance amplitudes. This observation
explains, in part, the large fractional modulation depths reported above.

Finally,  the  fractional  rms  modulation  depth  was  measured  for  a page  of 
newsprint  and  found  to  be  0.15.  This  value  is  considerably  lower  than  that 
for  any  of  the  natural  images  studied.  It  is  due  to  the  relatively  infrequent 
occurrence  of  printed  alphanumeric  figures. 

2.  Luminance  Power  Spectral  Density  Measurements 

The measured luminance power spectra data are shown in Figs. 15-18. The
numbers on the figures corresponding to specific data points refer to their
respective scene descriptions in Table 7. The average results for each of the
four general classes of images are shown in Fig. 19.

TABLE 6. LIST OF SCENES AND THEIR RESPECTIVE NORMALIZED
rms LUMINANCE MODULATION DEPTHS

No.   Description                   Normalized rms depth
 1    Apartment building            0.87
 2    Girl standing in room         0.86
 3    Girl with RCA sign            0.86
 4    Girl watching TV              0.80
 5    Aeroplane and two people      0.76
 6    Lady with checker board       0.76
 7    Girl on striped blanket       0.72
 8    Four people on beach          0.69
 9    Man and woman standing        0.66
10    Man and aeroplane             0.66
11    Red zinnia                    0.66
12    Girl and dotted background    0.63
13    Crowd of Indians              0.60
14    Ten people                    0.59
15    Girl and duck                 0.57
16    Man and woman in room         0.56
17    Power lines                   0.51
18    Girl in country               0.50
19    Stadium crowd                 0.47
20    Fruit basket                  0.47
21    Motel sign                    0.47
22    Standing girl                 0.47
23    Head of blond lady            0.45
24    Face of young girl            0.44
25    Manikin                       0.43
26    Girl and tree                 0.42
27    Soap box                      0.42
28    Lady in kitchen               0.41
29    Fruit basket                  0.39
30    Dog on grass                  0.35
31    Four people                   0.34
32    Aeroplane and mountains       0.31
33    Head of girl                  0.28
      Average                       0.56
34    Page of newsprint             0.15
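
The summary figures quoted for Table 6 can be reproduced from the tabulated
entries with a short sketch. The list below transcribes the 33 natural-scene
values from the table (the newsprint entry is excluded); the entry for scene 14
is read as 0.59, which is an assumption based on the descending order of the
table.

```python
from statistics import mean

# Normalized rms luminance modulation depths for scenes 1-33 of Table 6.
depths = [
    0.87, 0.86, 0.86, 0.80, 0.76, 0.76, 0.72, 0.69, 0.66, 0.66, 0.66,
    0.63, 0.60, 0.59, 0.57, 0.56, 0.51, 0.50, 0.47, 0.47, 0.47, 0.47,
    0.45, 0.44, 0.43, 0.42, 0.42, 0.41, 0.39, 0.35, 0.34, 0.31, 0.28,
]

average_depth = round(mean(depths), 2)
print(len(depths), average_depth, min(depths), max(depths))  # 33 0.56 0.28 0.87
```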

Figure 15. Average horizontal luminance-modulation power spectra
for individual scenes that contain one prominent object.
The numbers next to the symbols in the legend correspond
to the scene numbers of Table 6.

Figure 16. Average horizontal luminance-modulation power spectra for
individual scenes that contain two prominent objects. The
numbers next to the symbols in the legend correspond to
the scene numbers of Table 6.

Figure 17. Average horizontal luminance-modulation power spectra for
individual scenes that contain three to ten prominent
objects. The numbers next to the symbols in the legend
correspond to the scene numbers of Table 6.

Figure 18. Average horizontal luminance-modulation power spectra for
individual scenes that contain more than ten prominent
objects. The numbers next to the symbols in the legend
correspond to the scene numbers of Table 6.


TABLE 7. LIST OF SCENES USED TO DETERMINE
THE LUMINANCE POWER SPECTRA

Type of Scene                  No.   Description
Single-Object Scenes            1    Girl in hat
                                2    Red zinnia
                                3    Apartment building
                                4    Head of girl
                                5    Manikin
                                6    Soap box
                                7    Head of girl
                                8    Head of young girl

Two-Object Scenes               1    Child in sandbox
                                2    Church in country
                                3    Aeroplane and two people
                                4    Girl on blanket
                                5    Santa doll and toy
                                6    Girl with RCA sign
                                7    Girl with duck
                                8    Girl with sunflower
                                9    Man and woman

Three-to-Ten-Object Scenes      1    Nine people in a row
                                2    Two people in a room
                                3    Table with fruit
                                4    Group of flowers
                                5    Beach scene with four people
                                6    Fruit bowl
                                7    Girl watching TV
                                8    Lady in a room

Crowd Scenes                    1    Stadium crowd
                                2    Field of tulips
                                3    Crowd of Indians
                                4    Line drawing with type

Figure 19. Luminance-modulation power spectra obtained from the
average results of Figs. 15 through 18. At high display
frequencies, all spectra roll off as $1/\omega^2$. Also plotted
on this figure is the spectrum (vertical scale arbitrary)
for the average luminance term $\bar{I}$.

The data points of Fig. 19 support the general conclusion of TR1 that at
high frequencies the power spectra for natural scenes roll off as $1/\omega^2$.
As expected, however, Fig. 19 also indicates that the lowest frequency for which
this relationship applies is a function of the scene content. As the scenes
increase in complexity, from single-object scenes to scenes that contain many
prominent objects, the frequency above which the spectra roll off as
$1/\omega^2$ becomes progressively higher. This transition point, or
low-frequency cutoff, $\omega_c$, is roughly proportional to the inverse of the
dominant length scale of the images under investigation. For off-the-air video
we have found that the cutoff frequency is well approximated by
$\omega_c = 2\pi/\text{picture width}$ (TR1).*

*Note, however, that for all imaging systems the power spectra must have
their maximum value at dc [25].

As described in TR1, we feel that the $1/\omega^2$ roll-off strongly suggests
that edge transitions represent a significant feature of natural scenes. This
conclusion follows from the fact that the power spectra for an ensemble of
edges, randomly placed and with random amplitudes, will also have a
high-frequency roll-off of the form $1/\omega^2$. The significance of edges can
also be convincingly established by observation of one's environment or by
viewing on an oscilloscope the waveforms of luminance signals from a television
camera. In general, textural variations are of much lower modulation depth than
are edge transitions.
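
The link between edges and the $1/\omega^2$ roll-off can be sketched as follows.
The notation is introduced here only for illustration: each edge is idealized as
a step of amplitude $a_k$ at position $x_k$, with $u(x)$ the unit step, and the
positions are assumed independent and spread over a range large compared with
$1/\omega$ so that the cross terms average out.

```latex
\begin{align}
f(x) &= \sum_{k=1}^{N} a_k\, u(x - x_k), \qquad
F(\omega) = \frac{1}{i\omega} \sum_{k=1}^{N} a_k\, e^{-i\omega x_k}
\quad (\omega \neq 0), \\
\left\langle |F(\omega)|^2 \right\rangle
 &= \frac{1}{\omega^2} \left\langle \Bigl| \sum_{k=1}^{N} a_k\,
    e^{-i\omega x_k} \Bigr|^2 \right\rangle
 \approx \frac{N \left\langle a^2 \right\rangle}{\omega^2}.
\end{align}
```

Under these assumptions the ensemble-averaged power spectrum of randomly placed
edges falls as $1/\omega^2$, in agreement with the measured roll-off.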

3. Average rms Modulation Depth for Chrominance Information

Our  findings  on  th;  average  rms  modulation  depths  for  chrominance  in- 
formation are  summarized  in  Table  8.  The  averages  shown  in  this  table  for 
36  slides  were  obtained,  with  few  exceptions,  from  the  list  of  slides  given 
in  Table  9.  The  off-the-air  scenes  were  chosen  to  illustrate  the  range  of 
possible  chrominance  values  experienced  on  commercial  television.  They  are 
not  Intended  to  represent  an  ensemble  average  of  all  scenes  that  are  broad- 
cast. Each  entry  in  the  table,  for  off-the-air  scenes,  is  the  average  of 
approximately  15-mln  segments.  It  should  be  realized  that  the  saturations  of 
individual  scenes  within  this  averaging  period  varied  from  practically  zero 
to  almost  100Z. 

Several general conclusions may be drawn from the results presented in
Table 8. First, for natural scenes the rms distribution of saturations about
the white point is highly anisotropic. Typically the rms saturation along the
Ic axis was three times that of the rms saturation along the Qc axis. Second,
we note from the table that individual scenes have an average saturation
comparable to their chrominance rms modulation depths, but that for a large
number of scenes the average chromaticity tends more toward white (zero
saturation). This fact represents a major difference between the properties
of luminance and chrominance information. For an ensemble of images the
average luminance must be a large positive value, but for chrominance
information the average saturation can be zero. From the entries in Table 8
we find the average saturation along the Qc axis to be effectively zero,
whereas the average saturation along the Ic axis varies between 3.5 and 15%
(toward orange). Thus, for this ensemble of images the average hue is
slightly orange. This result is a reflection of the fact that in manmade
objects red, orange, and yellow are used freely to add "warmth" and
"excitement." Third, we see from the results presented in Table 8 that,
except for game shows and similarly garish settings, the average rms
saturation for chrominance variations is quite small. We suspect that for
the majority of non-manmade objects the average saturation would be even
less.
TABLE 8. SUMMARY OF RESULTS ON THE rms MODULATION
DEPTH OF CHROMINANCE INFORMATION

I.  Average of 36 Slides

    Ic Axis:   <C> = 3.5%     <|C|> = 7.7%
    Qc Axis:   <C> = 0.7%     <|C|> = 2.2%

II. Off-the-Air Video (All Entries Represent Approximately 15-min Averages)

                                    Ic Axis                  Qc Axis
    DESCRIPTION                 rms     average          rms     average

    Game Show (Video Tape)     23.0%    15.0% (orange)   7.5%    2.3% (magenta)
    Game Show                  29.0%    13.0%           16.6%    0.8% (green)
    Talk Show                  21.0%    12.0%            6.7%    2.5%
    Talk Show                   8.0%    11.0%            5.0%    ≈ 0
    Soap Opera                  9.5%     3.4%            2.9%    0.5%
    Soap Opera                  8.3%     6.8%            3.6%    2.0%
    Archie Bunker (Film)        7.0%     9.2%            2.6%    1.1%

Note: All values are given as percent of saturation.
TABLE 9. LIST OF SCENES USED TO DETERMINE
THE CHROMINANCE POWER SPECTRA

    Type of Scene                 No.   Description

    Single-Object Scenes           1    Red zinnia
                                   2    Indian girl
                                   3    Soap box
                                   4    Girl in tall grass
                                   5    Head of blond lady
                                   6    Manikin
                                   7    Lady
                                   8    Girl and scarf

    Two-Object Scenes              1    Santa doll with toy
                                   2    Aeroplane and two people
                                   3    Girl on blanket
                                   4    Girl watching TV
                                   5    Girl and sunflower
                                   6    Child in sandbox
                                   7    Flowers in vase
                                   8    Girl with RCA sign
                                   9    Man and woman standing
                                  10    Lady and soap box
                                  11    Girl in country

    Three-to-Ten-Object Scenes     1    Four people
                                   2    Lady in kitchen
                                   3    Man and woman in room
                                   4    Lady in room
                                   5    Two people in room
                                   6    Fruit and table setting
                                   7    Boy with kite
                                   8    Fruit basket
                                   9    Beach scene with four people

    Crowd Scenes                   1    Building
                                   2    Ten people
                                   3    Crowd of Indians
                                   4    Stadium crowd
                                   5    Tulip field
                                   6    Country scene in fall


4.  Chrominance  Power  Spectral  Density  Measurements 

The individual power spectral densities for the 34 slides listed in
Table 9 are presented in Figs. 20, 21, 22, and 23 for chrominance signals
along the Ic axis, and in Figs. 24, 25, 26, and 27 for chrominance signals
along the Qc axis. It should be emphasized that the classification of these
images into four groups was performed by inspection of the images. Careful
examination of the individual power spectral densities indicates considerable
variation between slides in each group.

Figure 20. Individual chrominance power spectra for the Cm,I terms of
scenes that contain one prominent object. The numbers next
to the symbols in the legend correspond to the scene numbers
of Table 9.

Figure 23. Individual chrominance power spectra for the Cm,I terms
of scenes that contain more than ten prominent objects.
The numbers next to the symbols in the legend correspond
to the scene numbers of Table 9.

Figure 24. Individual chrominance power spectra for the Cm,Q terms
of scenes that contain one prominent object. The numbers
next to the symbols in the legend correspond to the scene
numbers of Table 9.

Figure 25. Individual chrominance power spectra for the Cm,Q terms
of scenes that contain two prominent objects. The numbers
next to the symbols in the legend correspond to the scene
numbers of Table 9.

Figure 26. Individual chrominance power spectra for the Cm,Q terms of
scenes that contain three-to-ten prominent objects. The
numbers next to the symbols in the legend correspond to the
scene numbers of Table 9.

In Fig. 28 the average power spectral densities for all four
classifications of images, and for both the Ic and Qc axes, are presented.
For both the Ic and Qc axes, these results reflect the amount of structure in
the images by the degree to which their spectra are flattened at the lower
display frequencies. Also, as predicted from the results presented in
Table 8, the power along the Ic axis is considerably greater than that along
the Qc axis. Finally, we note that the power spectra for frequencies above a
lower cutoff frequency, as in the case of the luminance power spectra, roll
off as 1/ω². This result is expected due to the distribution of colors in
natural scenes. Transitions from one chrominance value to another almost
always occur suddenly. Because the power spectral densities for each
chrominance transition are of the form 1/ω², it follows that an ensemble
average of scenes composed of randomly placed edges will also have a power
spectral density of the form 1/ω². Thus, we find that the power spectra for
luminance and chrominance information are similar in form.
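
The step from a single transition to the ensemble can be written out
explicitly. The following is a sketch in notation introduced here (edge
amplitudes a_k, positions x_k, unit step u); the assumption that the
amplitudes are zero-mean and mutually uncorrelated is ours and is not taken
from TR1.

```latex
% One edge of amplitude a_k at position x_k: s_k(x) = a_k\,u(x - x_k).
% For \omega \neq 0 its Fourier transform and power are
S_k(\omega) = \frac{a_k\, e^{-i\omega x_k}}{i\omega},
\qquad
|S_k(\omega)|^2 = \frac{a_k^2}{\omega^2}.

% For N edges with zero-mean, mutually uncorrelated amplitudes the cross
% terms average to zero, so the ensemble-averaged power is
\Big\langle \Big| \sum_{k=1}^{N} S_k(\omega) \Big|^2 \Big\rangle
  = \sum_{k=1}^{N} \frac{\langle a_k^2 \rangle}{\omega^2}
  \;\propto\; \frac{1}{\omega^2}.
```

The same form holds whether a_k is a luminance step or a step along one of
the chrominance axes, since only the amplitude statistics change.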


D.  SUMMARY  OF  RESULTS 

The  major  conclusions  of  this  study  are: 


(1)  0.3 ≲ √<Im²>/Ī ≲ 1.0
     Average of 34 slides = 0.56.

(2)  The power spectral density for luminance information rolls
     off as 1/ω². For individual scenes with a finite-sized
     aperture, there also exists a lower cutoff frequency that
     is determined by the picture width and scene content. The
     largest term in the power spectral density is always at dc.

(3)  3% ≤ <C_I> ≤ 15% (toward orange)
     Average of 36 slides = 3.5%.

(4)  0 ≤ <C_Q> ≤ 2% (toward green)
     Average of 36 slides = 0.7%.

(5)  8% ≲ √<C_I²> ≲ [upper bound illegible]
     Average of 36 slides = 9.5%.

(6)  2% ≲ √<C_Q²> ≲ 10%
     Average of 36 slides = 2.4%.

(7)  2.0 ≲ √<C_I²>/√<C_Q²> ≲ 5.0
     The measured values along the Ic and Qc axes represented,
     approximately, the directions of maximum and minimum rms
     saturations.

(8)  The ensemble-averaged power spectral density for chrominance
     information rolls off as 1/ω² for high spatial frequencies.
     It tends toward zero saturation at dc.

(9)  The 1/ω² roll-off in the power spectra for both luminance and
     chrominance information is in agreement with the assumption
     that natural scenes are primarily composed of randomly located
     edge transitions.


IV.  SUBJECTIVE  SHARPNESS  OF  DISPLAYED  IMAGES 


A.  INTRODUCTION 

For the proper design and evaluation of displays it is necessary to know
the relationship between the objective physical variables of the display and
the subjective perceptual variables of the observer. One of the most
important subjective variables is image sharpness. In this section we will
present the results of a series of experiments that relate the subjective
impression of image sharpness to the modulation transfer function (MTF) of
the display. Specifically, we have determined the change in display MTF
necessary to produce a just-noticeable difference (jnd) in image sharpness.
The measurements were performed on a large, high-brightness display with
still images, both monochrome and colored.

The only previous experiments to determine a scale of subjective image
sharpness were performed by Baldwin [30], who used defocused black-and-white
35-mm motion picture film. For small-screen (18 cm x 19 cm), low-brightness
(~3 mL) images he measured the jnd in image sharpness as a function of the
display resolution. Baldwin found that the jnd in image sharpness was
determined by a constant (0.003 degree) change in the linear size of the
figure of confusion* of the projected images. That is, he found that as the
images became sharper, a larger percentage change in resolution was required
to produce a constant change in image sharpness. Because the bandwidths for
each of the images used by Baldwin were greater than 6
cycles/degree-of-vision (well past the peak of the visual system's MTF), the
general nature of his results is consistent with the current understanding of
the suprathreshold spatial frequency response of the eye. For still lower
spatial frequencies, however, his findings are not expected to apply. This is
because, at low spatial frequencies, the response of the eye decreases. Thus,
the change in resolution necessary to produce a jnd in image sharpness is
expected to increase with decreasing resolution.
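
Baldwin's observation can be illustrated with a short calculation. The
sketch below assumes, for illustration only, that resolution is inversely
proportional to the width of the figure of confusion; the blur widths used
are hypothetical values, not Baldwin's data. Only the 0.003-degree step comes
from the text.

```python
def percent_resolution_change(blur_width_deg, step_deg=0.003):
    """Percent rise in resolution when the blur width shrinks by step_deg.

    Assumes resolution is proportional to 1 / blur width (an
    illustrative assumption, not a result from Baldwin).
    """
    if blur_width_deg <= step_deg:
        raise ValueError("blur width must exceed the step")
    sharper = blur_width_deg - step_deg
    return 100.0 * (blur_width_deg / sharper - 1.0)


if __name__ == "__main__":
    for width in (0.060, 0.030, 0.012):  # hypothetical blur widths, degrees
        change = percent_resolution_change(width)
        print(f"{width:.3f} deg -> {change:.1f}% change in resolution")
```

The same 0.003-degree step is a larger fraction of a narrower blur, so the
percentage change in resolution per jnd grows as the image becomes sharper.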

*The figure of confusion is a rough measure of the width of the image
point-spread function.

Since the early work of Baldwin (in 1940), it has become widely
recognized that a more appropriate physical variable for quantifying image
sharpness is the display MTF, and not simply the display resolution. For
example, Schade [4] has shown that subjective sharpness is described more
accurately by the noise equivalent bandwidth (N_e), which weights a given
spatial frequency according to the square of the system's MTF at that
frequency. Recently we have shown, in TR1 and TR2, that the noise equivalent
bandwidth of Schade can be improved by generalizing its properties to the
perceptual level. We call this descriptor the visual capacity, C_v, because
it is the information theory equivalent of the channel capacity for a bilevel
transmission system.
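
As a concrete illustration of Schade's descriptor, the sketch below
integrates the squared MTF numerically, using the usual definition of the
noise equivalent bandwidth as the integral of |MTF(f)|² over f from 0 to
infinity. The Gaussian MTF and its parameter f0 are stand-ins chosen only
because the integral has a closed form to check against; they are not the MTF
of any display in this report.

```python
import math


def noise_equivalent_bandwidth(mtf, f_max, steps=20000):
    """Trapezoidal estimate of the integral of mtf(f)**2 from 0 to f_max.

    f_max should be large enough that the MTF is negligible beyond it.
    """
    df = f_max / steps
    total = 0.5 * (mtf(0.0) ** 2 + mtf(f_max) ** 2)
    for i in range(1, steps):
        total += mtf(i * df) ** 2
    return total * df


def gaussian_mtf(f, f0=10.0):
    """Hypothetical MTF, exp(-(f/f0)**2); f and f0 share the same units."""
    return math.exp(-((f / f0) ** 2))


if __name__ == "__main__":
    n_e = noise_equivalent_bandwidth(gaussian_mtf, f_max=80.0)
    exact = 10.0 * math.sqrt(math.pi / 8.0)  # closed form for f0 = 10
    print(f"numerical N_e = {n_e:.4f}, closed form = {exact:.4f}")
```

Because the MTF enters squared, frequencies where the response has fallen to
one half contribute only one quarter as much as frequencies passed without
loss.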

Unfortunately the measurements of Baldwin cannot be recast in terms of
our current understanding of image sharpness because the display point-spread
functions reported by him are ill defined. Furthermore, Baldwin did not record
the MTFs of the film he used and, although he indicated that film jitter
reduced the vertical resolution of his display, he did not quantify the
resolution loss resulting from this cause. All of these reasons make it
impossible to reconstruct the overall MTFs for his system.

In this section we report on a series of experiments which relate the
subjective impression of image sharpness to the MTF of the display. We will
show that our measured results are in good agreement with the assumption that
a jnd in image sharpness is determined by a constant change in the rms
gradient content of the images, a quantity proportional to the square root of
our display descriptor, the visual capacity. We will show that the jnd for
conventional images and for an image consisting only of a single luminance
transition are practically identical. This measurement supports our
contention that edges are a significant feature of natural scenes (see also
Section III). Finally, we will show that our measurements calibrate the
visual capacity. By this we mean that it is now possible, with the visual
capacity as a normalizing stimulus, to specify in perceptual terms the
subjective sharpness of displays with different MTFs.

B.  EXPERIMENTAL  APPARATUS 
1.  Introduction 

The rudiments of the system used to obtain the results presented in this
section are shown in Fig. 29. A variable-bandwidth, low-pass,
spatial-frequency filter composed of two parallel diffusive plates was used
as the display. The bandwidths of output images, produced by either direct
projection of 35-mm slides or by back-illumination of images placed over the
input plane of the display, were controlled by the separation, d, between the
two plates. Observers viewing the screen were asked to select the sharper of
two images, presented successively, that differed in bandwidth by a small
amount. At discrete display bandwidths, distributed throughout the bandpass
of the human visual system, the change in bandwidth required to produce a jnd
in image sharpness was measured. Each of these elements will be discussed in
detail in the succeeding paragraphs.

Figure 29. This figure shows the basic elements used in the performance
of the experiments. Observers, sitting at viewing distances
of either 128 or 400 cm, viewed images that were low-pass
filtered by a display composed of two parallel diffusive
plates. Images were produced either by the direct projection
of 35-mm slides or by the back-illumination of images placed
over the input plane of the display. The display was 59 cm
wide by 43 cm high; the average screen brightness was always
35 mL.

2.  Diffuser  Display 

In Fig. 30 photographs of the display constructed for these experiments
are shown. The display was essentially free of spatial-frequency noise, and
introduced no spurious coloration into the displayed images. The average
display brightness was always 35 mL.

Figure 30. These two photographs show the diffuser display constructed
for these experiments. The top photograph shows the input
side of the display with a superimposed luminance edge.
The bottom photograph shows, from the observer's position,
the same edge after filtering by the display.

The bandwidth of the display was varied by moving the output image
diffuser plate with a motor-controller whose accuracy was better than
±0.003 cm. During experiments a masking frame was placed in front of the
display (the screen is not shown in Fig. 30) to prevent extraneous visual
clues caused by the motion of the output image plate from being detected by
the observers. The free aperture of the display was 59 cm x 43 cm.

The electronic circuitry that fed the motor-controller allowed access to
two separate plate spacings by the flipping of a switch on a small hand-held
box. Thus, an observer could sequence back and forth between two images that
differed in bandwidth by a preset amount. Also on the hand-held box were two
button switches that corresponded to the two bandwidth settings. During an
experiment the observer would decide which of the two images he believed to
be the sharper, and press the button corresponding to his choice. If his
selection was correct, a green light flashed; if his selection was wrong, a
red light flashed. After his selection the results were automatically
recorded, and the order of presentation for the two images with different
bandwidths was randomized. Therefore, an observer who pushed only one of the
buttons on the hand-held box would guess, on the average, the sharper image
50% of the time.

The measured MTF for the display is shown in Fig. 31 as a function of the
display frequency times the plate separation, fd. The solid line on the figure
represents the analytical approximation for R, also given on the figure. It
may be seen that the correspondence between the measured data points and the
approximated curve is excellent. Figure 31 also shows that at all plate
separations the measured MTFs are of similar form. The large changes in MTF
shape that characterize lenses at different defocus positions [25] are not
experienced with this system. This represents an important advantage in these
experiments because it means that the MTFs can be specified accurately at
each plate separation value. An example of the input-output properties for
this display with a luminance edge input is shown in Fig. 30.

The limiting spatial frequency response for the diffuser display, referred
to as the screen MTF, is established by the granularity of the diffusive
material on the two parallel plates. We measured the screen MTF with the
plate separation, d, set to zero and found it to be equal to the MTF of
Fig. 31 with a plate spacing of 0.013 cm. Therefore, the screen MTF can be
considered as a small offset in the plate spacing, d. Where required, this
offset has been included in our results.

Figure 31. The modulation transfer function for the diffuser display
(shown in Fig. 30) as a function of the spatial frequency
on the display, f, times the plate spacing, d (abscissa: fd,
in cycles). These results show that for all plate spacings
the form of the modulation transfer functions is identical.


3.  Scene  Characteristics 

Two types of still images were investigated in this study:
representational scenes in black-and-white and color, and single-transition
luminance edges. The luminance edges were studied because of the unique role
they play in the derivation of our descriptors, and because of their
importance in the composition of natural scenes (see Section III).

The representational scenes we used are shown in Fig. 32. The top picture,
of a manikin, was studied in both black-and-white and color. The significant
features of this image are its large characteristic feature size, the texture
in the straw hat, and the large-area colors of extreme saturation. By
accentuating the colors in this image we attempted to magnify possible
differences in subjective sharpness between images in black-and-white and in
color. The stadium crowd scene shown in Fig. 32 was represented by a
characteristic feature size approximately one-tenth that of the manikin
scene. This scene was studied only in color.

Two luminance edges were investigated, as shown in Fig. 33, with contrasts
of 8% and 12%, respectively. The definition for edge contrast is given in
Fig. 34 as (Imax − Imin)/(Imax + Imin). These two edges were selected as
representative of the range of contrasts that may reasonably be expected in
actual images.
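
The contrast definition can be made concrete with a small helper. The sketch
below assumes the edge is centered on the 35-mL average screen brightness
quoted for the display, which the text does not state explicitly for the
edges; the function names are introduced here for illustration.

```python
def edge_contrast(i_max, i_min):
    """Edge contrast, (Imax - Imin) / (Imax + Imin)."""
    if i_min < 0 or i_max < i_min or i_max + i_min == 0:
        raise ValueError("need i_max >= i_min >= 0 and a nonzero sum")
    return (i_max - i_min) / (i_max + i_min)


def edge_levels(mean_luminance, contrast):
    """Luminances on the bright and dark sides of an edge.

    The two levels average to mean_luminance and reproduce the
    requested contrast under the definition above.
    """
    return mean_luminance * (1 + contrast), mean_luminance * (1 - contrast)


if __name__ == "__main__":
    for contrast in (0.08, 0.12):
        bright, dark = edge_levels(35.0, contrast)  # assumed 35-mL mean
        print(f"contrast {contrast:.2f}: {bright:.1f} mL / {dark:.1f} mL, "
              f"check = {edge_contrast(bright, dark):.2f}")
```

With this definition the contrast equals the half-difference of the two
levels divided by their mean, so it is independent of the absolute screen
brightness.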

Figure 35 shows the average of the luminance power spectral densities
in the horizontal direction for both the manikin and stadium crowd scenes as
a function of the diffuser display frequency. These spectra were obtained
by the use of the techniques outlined in Section III with a TV camera. That
is, as shown schematically in Fig. 14, the power spectra were obtained from
the modulation terms Im after the average scene luminances Ī were subtracted
off. As explained in Section III, this technique has the advantage of
minimizing the influence of the finite picture size on the measurements of
the power spectra.

The power spectra for the manikin and stadium crowd scenes clearly
reflect the important features of each image. The spectrum for the crowd
scene is relatively flat up to a frequency that is roughly equal to the
inverse width of one spectator. Above this frequency the spectrum rolls off
approximately as 1/ω². The power spectrum for the manikin scene rolls off
approximately as 1/ω² at all but the lowest spatial frequencies, manifesting
the larger scale of this scene. The average normalized modulation depth,
√<Im²>/Ī, for the manikin scene was 0.43; for the stadium crowd scene it was
0.47.

At high spatial frequencies the power spectra for the two luminance edges
shown in Fig. 33 also roll off as 1/ω². For spatial frequencies below roughly
the inverse of one picture-width the spectra progressively flatten as the
frequency is decreased. It is interesting to note that although the stadium
crowd scene, the manikin scene, and the luminance edges appear very
dissimilar, over a very wide range of spatial frequencies their average power
spectra are of almost identical form.

Figure 35. Average horizontal luminance power spectra for the crowd scene
and the manikin scene as a function of display frequency
(abscissa: display frequency in cycles/cm). At high spatial
frequencies both spectra roll off approximately as 1/ω².
These spectra were obtained from the Im luminance terms, as
shown in Fig. 14.
4. Scene Modulation Transfer Functions

In Figs. 36 and 37 the various processes required to produce the images,
described in the preceding section, are summarized. The luminance edges were
formed by back-illuminating Wratten neutral-density filters that were placed
over the input plane of the diffuser display (see Fig. 29). The overall MTF
for this arrangement is given simply by the product of the screen MTF and the
diffuser display MTF, as shown in Fig. 36. Since the luminance edges were
degraded only by the screen MTF, whose bandwidth was in excess of 50 cycles/cm
(−3 dB), the limiting sharpness for these edges was excellent.

In Fig. 37 the basic steps necessary to produce the representational scenes
of Fig. 32 are summarized. For the color and black-and-white images of the
manikin, the MTFs were obtained by photographing, under identical conditions,
a series of sine-wave gratings of different spatial frequency. These gratings
were then used to measure the system MTF on the diffuser display with a plate
separation of zero. Thus the resulting MTFs, shown in Fig. 38, include all of
the degrading influences up to the diffuser display.

Figure 36. System elements necessary for the generation of luminance
edges. Because the luminance edges were produced by the
back-illumination of neutral-density filters placed directly
over the input image plane, they were degraded only by
the screen MTF before filtering by the diffuser MTF. Thus,
the limiting quality of the edges at point A was excellent.

Figure 37. System elements necessary to produce the representational
scenes of Fig. 32. The overall system MTFs were obtained
for each image by measuring the MTFs at point A. Therefore,
all degrading influences up to the diffuser display
were accounted for. The measured MTFs are shown in
Fig. 38.
Figure 38. The measured MTFs for the two representational scenes used
in these experiments (Figs. 32-37).

The 35-mm color slide of the manikin was produced with Kodachrome 25, a
direct-reversal film of excellent quality. The black-and-white 35-mm slide of
the manikin was obtained with Panatomic-X negative film, which was then
printed on Kodak 5302 positive stock. The extra processing step explains why
the MTF for the black-and-white slide is less than the MTF for the color
slide of the manikin. The crowd scene was obtained from a 3-in. x 5-in.
master color print of excellent quality. The MTF reported in Fig. 38 for this
scene represents the processing steps necessary to reduce it to a 35-mm color
slide. It is assumed that the additional MTF loss inherent in the original
was small compared to the MTF loss in subsequent processing steps. Most
significantly, however, this slide was studied primarily at very low diffuser
bandwidths where the effects of the processing MTFs are negligible.

It should be noted that, as discussed in Section III (see also Fig. 35),
the expected form of the power spectra for these scenes, at high spatial
frequencies, is 1/ω². This assumption makes it possible to reconstruct the
spectra for these scenes on the diffuser display with the aid of Fig. 38.

5.  Observer  Modulation  Transfer  Functions 

The visual MTF used in the computations of Section IV.D was derived from
the visual-contrast-sensitivity function shown as a solid line in Fig. 39.
This curve was obtained in a previous study (TR2) with an average screen
brightness of 35 mL and a display that subtended approximately 7.0° of visual
angle.

The data points shown in Fig. 39 represent the measured
contrast-sensitivity values for the two observers who participated in these
experiments. These data were obtained by the method of adjustment. That is,
at each spatial frequency the contrast of the sine-wave grating was reduced
until the observer stated that the grating was just visible. These
experiments were performed with the diffuser display at an average screen
brightness of 35 mL; each point represents the average of ten experiments.

C. EXPERIMENTAL PROCEDURE

Observers viewing the diffuser display were asked to determine which one
of two images, presented sequentially and with different bandwidths, appeared
to be the sharper. The observers were instructed to look at the central third
of the display and use whatever clues they felt would best enable them to
distinguish the sharper image. The images were presented continuously on the
screen, as their bandwidths were changed from one value to the other. The pace
of each experiment was controlled by the observers, who flipped a switch on a
small hand-held box to alternate between the two images presented. No-choice
decisions were not allowed unless the observer was distracted.

Figure 39. Measured contrast-sensitivity functions for the two observers
who participated in this study. The solid line was used to
define a visual MTF for use in analytical computations.

Data points were obtained according to the following procedure: The
bandwidth of the display was set at two values, $\nu_1$ and $\nu_2$, where
$\nu_2 = \nu_1 + \delta\nu$. The observer, sitting at a fixed viewing distance and
viewing the screen normally, alternated between the two images once. Then he
would decide which of the two images, the first or the second, was the sharper,
and record his answer by pressing the appropriate button on the box that he
held. After each choice he would be informed as to the correctness of his
decision. Subsequently the order of the two images was again randomized. Thus,
50% of the time the first image would be the sharper and 50% of the time, the
second image. After 50 transitions at these two display bandwidths the
observer's percentage of correct answers was recorded, and the value of
$\delta\nu$ changed. This procedure was repeated for different values of
$\delta\nu$ until the psychometric function, $P(\delta\nu)$, at selected values
of $\nu_1$ throughout the bandpass of the eye, was established. It was found
from preliminary experiments that $P(\delta\nu)$, which is defined as the
probability of correctly choosing the sharper of two images at a specific value
of $\nu_1$ as a function of $\delta\nu$, is well approximated by a normal error
curve (a straight line on arithmetic probability paper). All the results
presented in this section are for $P(\delta\nu) = 0.75$, assuming that the data
points at each $\nu_1$ are described by a normal distribution.
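The reduction of a set of percent-correct scores to a single jnd can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the function name, the least-squares probit fit, and the data values are all assumptions introduced here. The fit is the numerical analogue of drawing a straight line on arithmetic probability paper and reading off the 75% point.

```python
from statistics import NormalDist

def jnd_from_psychometric(delta_nu, p_correct, p_target=0.75):
    """Fit a cumulative normal to (delta_nu, p_correct) and return the
    bandwidth step at which the fitted curve reaches p_target.

    The fit is a least-squares straight line through the probit-transformed
    proportions (hypothetical helper, not taken from the report).
    """
    std = NormalDist()
    z = [std.inv_cdf(p) for p in p_correct]      # probit transform
    n = len(delta_nu)
    mean_x = sum(delta_nu) / n
    mean_z = sum(z) / n
    sxx = sum((x - mean_x) ** 2 for x in delta_nu)
    sxz = sum((x - mean_x) * (zi - mean_z) for x, zi in zip(delta_nu, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return (std.inv_cdf(p_target) - intercept) / slope

# Hypothetical data: fraction of correct choices in 50 trials per step.
steps = [0.5, 1.0, 1.5, 2.0, 2.5]            # delta_nu, cycles/degree
fractions = [0.54, 0.66, 0.76, 0.86, 0.94]   # proportion correct
jnd = jnd_from_psychometric(steps, fractions)
print(f"delta_nu at P = 0.75: {jnd:.2f} cycles/degree")
```

Proportions of exactly 0 or 1 cannot be probit-transformed, so a real analysis would have to exclude or adjust such points.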

Two  subjects,  each  with  better  than  20-20  vision  (Snellen  rating),  were 
hired  to  perform  these  experiments.  They  were  both  well  motivated  and  highly 
experienced  in  the  performance  of  these  tests.  They  used  normal  binocular 
vision  to  view  the  display. 

The experiments were performed at viewing distances of either 128 or
400 cm. These viewing distances result in display widths of 24° and 7°,
respectively. The effect of the different display sizes on the
visual-contrast-sensitivity function is shown in Fig. 39. The illumination
around the display was approximately one eighth of the average screen
luminance. This ratio was chosen to assure both good screen contrast and good
visual performance. In a previous study (TR2), intended to determine the
effect of surround luminance on the visual-contrast-sensitivity function, we
found the contrast-sensitivity function not to be affected severely by
surround luminances of less than a factor of ten higher or lower than the
mean display luminance.

D.  THE  PERCEIVED  EDGE-GRADIENT  CONTENT 

In this section we will define a quantity proportional to the square root
of the visual capacity; we will use this quantity to interpret the measured
subjective sharpness results presented in the next section. In TR1 we showed
that, as a direct consequence of the $1/\omega^2$ form of the power spectra of
natural scenes, this quantity is proportional to the number of edge transitions
that can be perceived across a display. In this context we will show that a
constant change in the number of perceived edges can be related to the
subjective sharpness of displayed images.

We start from the definition of the reduced perceived brightness (TR2,
p. 6) expressed as

$$e = E / \langle E^2 \rangle^{1/2} \qquad (2)$$

The mean-square perceived-gradient content is given by

$$g^2 = \langle (\partial e/\partial\theta)^2 \rangle = \langle (\partial E/\partial\theta)^2 \rangle / \langle E^2 \rangle \qquad (3)$$

where $\theta$ is the viewing angle. This quantity can be interpreted physically
as the average inverse square of the angular distance between perceived
transitions.
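As an illustrative check of this interpretation (not part of the original derivation), consider a perceived brightness consisting of a single sinusoid of retinal frequency $\nu$. The amplitude $A$ is a symbol introduced here, and the averages are taken over a whole number of periods:

$$E(\theta) = A \sin(2\pi\nu\theta), \qquad \langle E^2 \rangle = A^2/2, \qquad \langle (\partial E/\partial\theta)^2 \rangle = (2\pi\nu)^2 A^2/2$$

$$g^2 = (2\pi\nu)^2, \qquad g = 2\pi\nu$$

In this case $g$ does not depend on the amplitude and is inversely proportional to the period $1/\nu$, that is, to the angular spacing between successive transitions.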

If we now assume the scenes under consideration to have power spectra of
the form $1/\omega^2$, we obtain

$$g^2 = r^2 \, \frac{\int_0^\infty |R(\omega)|^2 \, O^2(\omega r/2\pi) \, d\omega}{\int_0^\infty \omega^{-2} \, |R(\omega)|^2 \, O^2(\omega r/2\pi) \, d\omega} \qquad (4)$$

where $r$ is the viewing distance. Note that the numerator is the visual
capacity, $C_v$ (TR1), while the denominator is the perceived mean-square signal
power, $S^2$, for an input power spectrum of the form $\Phi(\omega) = 1/\omega^2$.
Thus we have, for the rms perceived-gradient content,

$$g = r C_v^{1/2} / S \qquad (5)$$

We assume that the perception of a change in sharpness of a display is
governed by the quantity $g$ in that a change $\Delta g$ is required before an
observer can perceive a sharpness difference. We further assume that the
required change $\Delta g$ is independent of the bandwidth.

The original assumption that the relevant psychophysical quantity involves
the square of the input signal, in this case the perceived mean-square gradient,
is supported by our quantitative interpretation of the suprathreshold sine-wave
results of Nachmias and Sansbury [31], as described in Section V. The
particular form of $g$ is reasonable, since it is a measure of the average
structure, or luminance variation, perceived by the viewer in the displayed
image. Thus our assumption about $\Delta g$ is equivalent to postulating that a
constant change in the average perceived structure of an image is required to
produce equal changes in image sharpness.

In order to compute $g$ the following analytic form for $R(\omega)$ was used:

$$R(\omega) = [1 + 0.951(\omega/\omega_0)^2 + 0.0495(\omega/\omega_0)^6]^{-1} \qquad (6)$$

where $\omega_0$ is the -3-dB point of $R(\omega)$, i.e., $R(\omega_0) = 1/2$. In
Fig. 39 the closeness of this fit to the measured MTF can be seen. The form for
$O(\nu)$ was taken from our threshold-contrast-sensitivity-function measurements
(TR2, p. 95) with a display diameter of 6.5°. In Fig. 39 it is shown that this
value for $O(\nu)$ is an accurate representation of the visual MTFs of the
observers who participated in this study.

The psychophysical quantity $\Delta g$ was computed by determining, at a specific
$\omega_0$, the change in display bandwidth $\Delta\omega_0$ required to produce a
perceivable change in display sharpness. Thus, the condition to be satisfied is

$$\Delta g = g(\omega_0 + \Delta\omega_0) - g(\omega_0) = \text{constant} \qquad (7)$$

The magnitude of $\Delta g$ represents the only adjustable parameter. It determines
the scale of the required bandwidth changes, but for small $\Delta g$ it does not
alter the form of the variation of $\Delta\omega_0$ with $\omega_0$. The computed
values, presented in the next section, were obtained by assuming that
$\Delta g = 1/7.5$ deg$^{-1}$. It will be shown that this value represents an
accurate approximation to our experimental results.
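The computation described by Eqs. (4), (6), and (7) can be sketched numerically. The sketch below makes several assumptions that are not in the report: the measured visual MTF $O(\nu)$ from TR2 is replaced by a hypothetical band-pass stand-in peaking at 3 cycles/degree, the integrals are rewritten in retinal frequency $\nu = \omega r/2\pi$ and truncated at 60 cycles/degree, and all function names are illustrative. The numbers it prints therefore show only the mechanics of the method and are not the report's predictions.

```python
import math

NU_PEAK = 3.0   # assumed peak of the stand-in visual MTF, cycles/degree

def display_mtf(nu, nu_c):
    """Eq. (6), written in retinal frequency; nu_c is the point where R = 1/2."""
    x = nu / nu_c
    return 1.0 / (1.0 + 0.951 * x**2 + 0.0495 * x**6)

def visual_mtf(nu):
    """Hypothetical band-pass stand-in for O(nu), normalized to 1 at NU_PEAK."""
    return (nu / NU_PEAK) * math.exp(1.0 - nu / NU_PEAK)

def gradient_content(nu_c, nu_max=60.0, n=6000):
    """rms perceived-gradient content g from Eq. (4) for a 1/omega^2 scene.

    With nu = omega * r / (2 * pi), Eq. (4) becomes
    g^2 = (2 pi)^2 * Int(R^2 O^2 dnu) / Int(R^2 O^2 / nu^2 dnu).
    Both integrals use the midpoint rule, which never evaluates nu = 0.
    """
    h = nu_max / n
    num = den = 0.0
    for i in range(n):
        nu = (i + 0.5) * h
        w = (display_mtf(nu, nu_c) * visual_mtf(nu)) ** 2
        num += w
        den += w / nu**2
    return 2.0 * math.pi * math.sqrt(num / den)

def bandwidth_step(nu_1, delta_g, nu_hi=200.0):
    """Solve Eq. (7), g(nu_1 + step) - g(nu_1) = delta_g, by bisection.

    Returns None when even the bandwidth nu_hi cannot supply delta_g.
    """
    g1 = gradient_content(nu_1)
    if gradient_content(nu_hi) - g1 < delta_g:
        return None
    lo, hi = nu_1, nu_hi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if gradient_content(mid) - g1 < delta_g:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - nu_1

step = bandwidth_step(5.0, 1.0 / 7.5)
print(f"nu_1 = 5.0, step for one jnd = {step:.3f} cycles/degree")
```

Bisection is adequate here because, for the family of display MTFs in Eq. (6), $g$ increases monotonically with bandwidth. The `None` return corresponds to the case in which no finite bandwidth increase can yield another jnd.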

It is important to realize that the results obtained for large $\omega_0$ (i.e.,
$\omega_0 r/2\pi\nu_0 \gtrsim 4$, where $\nu_0 \approx 3$ cycles/degree-of-vision) are
insensitive to the details of the form of the MTF of the visual system $O(\nu)$,
but depend crucially on the low-frequency behavior of the display. For example,
at $\omega_0 r/2\pi\nu_0 = 6$ the -3-dB point of the display lies at about 18
cycles/degree-of-vision, where the sensitivity of the human visual system is
small. However, the peak of the visual MTF lies at a display frequency at which
$R(\omega)$ is greater than 0.97. Thus the perceived effect of changes in the
display bandwidth parameter $\omega_0$ arises primarily from changes in the
low-frequency part of the display's spectral range. Stated differently, the
perceived sharpness improvement resulting from bandwidth increases is due
mainly to the improvement of the low-frequency response.

Figure 40. This figure shows the relative spectral increase when a
display (for an MTF of the form shown in Fig. 31) with a
bandwidth of 10 cycles/degree-of-vision is increased to one
of 11 cycles/degree-of-vision. Although the increase in
spectral response due to the increase in display bandwidth,
$\Delta R^2(\omega)$, occurs throughout the bandpass of the eye, the
perceived response, $[\Delta R^2(\omega) \, O^2(\omega r/2\pi)]$, is significant
only at the lower spatial frequencies.
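The value of 0.97 quoted in the text for $\omega_0 r/2\pi\nu_0 = 6$ can be checked directly from the analytic display MTF of Eq. (6). Taking the peak of the visual MTF to lie at $\nu_0$, so that $\omega/\omega_0 = 1/6$ there, gives

$$R = [1 + 0.951(1/6)^2 + 0.0495(1/6)^6]^{-1} \approx [1 + 0.02642 + 0.000001]^{-1} \approx 0.974$$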


In Fig. 40 this point is illustrated for a display with bandwidth
$\omega_0 r/2\pi = 10$ cycles/degree-of-vision and a change in display bandwidth
of $\Delta\omega_0 r/2\pi = 1$ cycle/degree-of-vision. In this figure we have
plotted the relative spectral response of the quantities

$$\Delta R^2(\omega) = |R(\omega)|^2_{\text{at } \omega_0 = \omega_0' + \Delta\omega_0} - |R(\omega)|^2_{\text{at } \omega_0 = \omega_0'}$$

and $\Delta R^2(\omega) \, O^2(\omega r/2\pi)$ as a function of retinal frequency. It
can be seen from Fig. 40 that although the change in display bandwidth
$\Delta\omega_0$ results in an increase in $\Delta R^2(\omega)$ throughout the
bandpass of the eye, the perceived response $\Delta R^2(\omega) \, O^2(\omega r/2\pi)$
is significant only near the peak of the visual MTF (which lies at low display
frequencies). The contribution in spectral response for retinal frequencies
between 10 and 35 cycles/degree-of-vision is negligible! This observation has
strong implications for display design. For many displays of practical interest
the perceived sharpness can be improved simply by increasing the low-frequency
response, rather than by extending the display bandwidth.
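The comparison plotted in Fig. 40 can be imitated with a short calculation. This sketch uses the analytic display MTF of Eq. (6) for bandwidths of 10 and 11 cycles/degree-of-vision, but it replaces the measured visual MTF with a hypothetical band-pass stand-in peaking at 3 cycles/degree, so the printed numbers are indicative only. All names are illustrative.

```python
import math

NU_PEAK = 3.0   # assumed peak of the stand-in visual MTF, cycles/degree

def display_mtf(nu, nu_c):
    """Eq. (6) in retinal frequency; nu_c is the bandwidth where R = 1/2."""
    x = nu / nu_c
    return 1.0 / (1.0 + 0.951 * x**2 + 0.0495 * x**6)

def visual_mtf(nu):
    """Hypothetical band-pass stand-in for O(nu), equal to 1 at NU_PEAK."""
    return (nu / NU_PEAK) * math.exp(1.0 - nu / NU_PEAK)

def delta_r2(nu, nu_c=10.0, step=1.0):
    """Increase in |R|^2 when the bandwidth goes from nu_c to nu_c + step."""
    return display_mtf(nu, nu_c + step) ** 2 - display_mtf(nu, nu_c) ** 2

# Sample both curves from 0.05 to 35 cycles/degree in steps of 0.05.
freqs = [0.05 * i for i in range(1, 701)]
raw = [delta_r2(nu) for nu in freqs]
perceived = [delta_r2(nu) * visual_mtf(nu) ** 2 for nu in freqs]

peak_raw = freqs[raw.index(max(raw))]
peak_perceived = freqs[perceived.index(max(perceived))]
# Share of the perceived increase contributed by 10 to 35 cycles/degree.
share_above_10 = (
    sum(p for nu, p in zip(freqs, perceived) if nu >= 10.0) / sum(perceived)
)

print(f"Delta R^2 peaks near {peak_raw:.1f} cycles/degree")
print(f"perceived response peaks near {peak_perceived:.1f} cycles/degree")
print(f"share of perceived response above 10 cycles/degree: {share_above_10:.1%}")
```

Even with this crude stand-in, the weighted curve peaks at a much lower frequency than the unweighted one, which is the qualitative point made in the text.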

On the other hand, for small display bandwidths [$(\omega_0 r/2\pi\nu_0) \lesssim 1$], the
predicted subjective sharpness results are sensitive to the precise form of
the visual MTF, $O(\nu)$. We will return to this point later.

E.  RESULTS 

The measured results are shown in Figs. 41-45, plotted with $(\nu_2 - \nu_1)/\nu_2$ as
a function of $\nu_1$. The frequencies $\nu_1$ and $\nu_2$ are, respectively, the
lower and higher bandwidths (defined at $R(\omega_0) = 1/2$) necessary for an
observer to discern 1 jnd in image sharpness as described in Section IV.C. Each
data point on the figures represents over 200 choices between two images with
different bandwidths. The coordinates are chosen so that when
$(\nu_2 - \nu_1)/\nu_2 = 1$, the bandwidth must be extended from $\nu_1$ to infinity
to produce one additional jnd in image sharpness. Also shown in these figures
are the calculated results based on the results of Section IV.D.

The quantity $(\nu_2 - \nu_1)/\nu_2$ in these figures represents close to the minimum
change in bandwidth that can be detected. The values were obtained under optimum
conditions by highly trained observers viewing still images on a noiseless,
large, bright display. When conventional displays, such as commercial
television, are viewed, 1 jnd in image sharpness would represent an almost
imperceptible change. However, a change in sharpness corresponding to 3 jnd's
would, in most situations, represent a conspicuous improvement.

An obvious feature of the measured results is that, for the two observers
studied, the measured values for $(\nu_2 - \nu_1)/\nu_2$ at each $\nu_1$ differ by as
much as a factor of 2. This is because individual observers (a) will have
different criteria for deciding a jnd in image sharpness, (b) will generally
look at the display in different locations, and (c) will not be identical in
their visual performance. In a preliminary study with ten observers we found
that at $\nu_1 = 4$ cycles/degree-of-vision for the manikin scene the measured
jnd's varied by 4 to 1 from the most sensitive to the least sensitive
observers. The two observers reported here (P.R. and B.S.) were both above
average in sensitivity when compared with this group.

Figure 41. Measured results, $(\nu_2 - \nu_1)/\nu_2$ as a function of $\nu_1$, for an
82%-contrast luminance edge. One jnd in image sharpness was
defined as the difference in bandwidth ($\delta\nu = \nu_2 - \nu_1$) from
$\nu_1$ necessary for an observer to perceive a change in image
sharpness 75% of the time. Each data point is the average
of over 200 events, and all retinal frequencies are defined
at the point at which $R(\omega) = R(2\pi\nu/r) = 1/2$.

Our results differ from Baldwin's in that, independent of the size of
the point-spread functions,* the linear change in their width necessary to
produce 1 jnd in image sharpness was not constant. We found that the linear
change in the width of the point-spread functions increased at both high and
low spatial frequencies with a minimum value in the range of 3-7 cycles/degree-
of-vision. This is a result that was successfully predicted by the analysis
presented in Section IV.D.

*For the diffuser display described in Section IV.B, the width of the
point-spread function is directly proportional to the plate spacing, d. The
relationship between plate spacing and display MTF is shown in Fig. 31.

[Figure 42 plot. Panel title: 12% contrast edge; viewing distances 400 cm and 128 cm. Abscissa: retinal frequency $\nu_1$ (cycles/degree-of-vision).]

Figure 42. Measured results, $(\nu_2 - \nu_1)/\nu_2$ as a function of $\nu_1$, for the
12%-contrast luminance edge. One jnd in image sharpness was
defined as the difference in bandwidth ($\delta\nu = \nu_2 - \nu_1$) from
$\nu_1$ necessary for an observer to perceive a change in image
sharpness 75% of the time. Each data point is the average
of over 200 events, and all retinal frequencies are defined
at the point at which $R(\omega) = R(2\pi\nu/r) = 1/2$.

The measured results for each of the images studied are similar in form.
Also, it can be seen that our analytical results are in good agreement with
the measured results for display frequencies above 10 cycles/degree-of-vision.
This is significant since most practical displays are situated in this
frequency range. This is also the frequency range where the detailed shape of
the visual MTF, $O(\nu)$, is least important, and where our conclusion as to the
power spectra of natural scenes having the form $1/\omega^2$ is most valid. At the
lowest frequencies, however, our analytical prediction consistently
overestimates the slope of $(\nu_2 - \nu_1)/\nu_2$ vs $\nu_1$. This difference may be
attributed to several possible causes, including the assumptions of our model.
Yet this flattening might in part be a consequence of a suprathreshold visual
MTF that does not attenuate as quickly as we have assumed it would at low
spatial frequencies.* This possibility is given direct support by the work of
Georgeson and Sullivan [32]. They measured a suprathreshold MTF for the visual
system, using a contrast matching technique, and found that the measured MTFs
were flattened over the middle range of retinal frequencies (0.5-10
cycles/degree-of-vision) for sine-wave gratings with contrasts above threshold.
If we were to incorporate into our formalism [Eq. (4)] a visual MTF with these
properties, the resulting predictions for the scaling of $(\nu_2 - \nu_1)/\nu_2$ vs
$\nu_1$ would come very close to the measured results. Or, conversely, if we were
to determine the visual MTF that best fits our measured results, we would
define the correct operational suprathreshold visual MTF to use in the analysis
of natural images.

Now consider the results for the two luminance edges studied, shown in
Figs. 41 and 42. These results are among our most important because they were
obtained with essentially perfect input edges, all of whose unfiltered power
spectra are of the form $1/\omega^2$. Thus, one of the assumptions of our model is
met exactly, except at the lowest spatial frequencies where the power spectra
are determined by the picture width. However, as our previous results (TR2,
p. 95) with sine-wave gratings have shown, a display 24° wide is essentially
of infinite width perceptually.

At the highest spatial frequencies the results of Figs. 41 and 42 are
practically identical. Also, the correspondence between the measured values
and the theoretical prediction is excellent. At lower spatial frequencies
three observations can be made. First, the jnd for the 82% contrast edge is
approximately 80% smaller than the corresponding jnd for the 12% contrast
edge. This difference in sensitivity is quite small when compared to the
almost 50-to-1 difference in signal power between these two images. Thus,
our assumption of normalizing the reduced perceived brightness with the rms
perceived brightness from the display is supported. Second, note that the
measured values of $(\nu_2 - \nu_1)/\nu_2$ for the 82% contrast edge are practically
independent of $\nu_1$ from 0.1 to 2 cycles/degree-of-vision, whereas over this
frequency range the measured values of $(\nu_2 - \nu_1)/\nu_2$ for the 12% contrast
edge decrease slightly with increasing frequency. That is, these results are in
general agreement with the proposition that the visual MTF is a function of
scene contrast; the shape of the MTF flattens with increasing contrast. Third,
the measured results for the 82% contrast edge are almost identical to the
measured results for the representational scenes. This fact is additional
indirect evidence that edges are the most significant feature of natural scenes.

*An additional complication will be discussed below.

Figure 43. Measured results, $(\nu_2 - \nu_1)/\nu_2$ as a function of $\nu_1$, for the
manikin scene in black-and-white. One jnd in image sharpness
was defined as the difference in bandwidth ($\delta\nu = \nu_2 - \nu_1$)
from $\nu_1$ necessary for an observer to perceive a change in
image sharpness 75% of the time. Each data point is the
average of over 200 events, and all retinal frequencies are
defined at the point at which $R(\omega) = R(2\pi\nu/r) = 1/2$.

Figure 44. Measured results, $(\nu_2 - \nu_1)/\nu_2$ as a function of $\nu_1$, for the
manikin scene in color. One jnd in image sharpness was
defined as the difference in bandwidth ($\delta\nu = \nu_2 - \nu_1$)
from $\nu_1$ necessary for an observer to perceive a change in
image sharpness 75% of the time. Each data point is the
average of over 200 events, and all retinal frequencies are
defined at the point at which $R(\omega) = R(2\pi\nu/r) = 1/2$.
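The "almost 50-to-1" difference in signal power quoted for the two luminance edges is consistent with signal power scaling as the square of the edge contrast. That scaling is an assumption made here for the check; the contrasts of 82% and 12% are those given for Figs. 41 and 42:

$$(0.82/0.12)^2 \approx (6.83)^2 \approx 46.7$$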


Next consider the results obtained by the use of the black-and-white and
colored manikin scenes (see Figs. 43 and 44). For frequencies above
approximately 5 cycles/degree-of-vision the measured results are consistently
lower than the predicted results. This is an expected finding due to the limited
bandwidth of these images, as shown in Fig. 38. It is possible to interpret
this result as arising from a display whose actual frequency response is
the product of the appropriate MTFs of Figs. 31 and 38. If the MTFs for the
images are included in the analysis, the result is to produce the desired
correspondence between theory and experiment.

Figure 45. Measured results, $(\nu_2 - \nu_1)/\nu_2$ as a function of $\nu_1$, for the
crowd scene in color. One jnd in image sharpness was
defined as the difference in bandwidth ($\delta\nu = \nu_2 - \nu_1$)
from $\nu_1$ necessary for an observer to perceive a change in
image sharpness 75% of the time. Each data point is the
average of over 200 events, and all retinal frequencies are
defined at the point at which $R(\omega) = R(2\pi\nu/r) = 1/2$.

At the lowest spatial frequency the differences in the results of the
colored and black-and-white manikin scenes are not large enough to allow
specific conclusions to be drawn. Our general appraisal of these measurements is
that the addition of color to an image does not appreciably improve the jnd in
image sharpness. This conclusion is consistent with our current understanding
of luminance and chrominance information processing in the human visual system
(see Section V).

In Fig. 45 the results of our measurements of the stadium crowd scene are
given. This slide was originally included because at low spatial frequencies
it has a spectrum that is relatively flat compared to the manikin scene.
However, the measured results for the crowd scene and the manikin scene are
obviously indistinguishable, especially at low spatial frequencies. This result
is encouraging because it implies that the spectral variation among scenes will
have to be considerably greater than the one that exists between these two
scenes (see Fig. 35) in order to significantly affect the scales of subjective
sharpness reported here. However, as we have shown in Section III, the power
spectra for natural scenes are all of remarkably similar form. We therefore
conclude that our measurements on subjective sharpness will be applicable to a
wide variety of natural scenes.

Finally, throughout this section we have tacitly assumed that image
sharpness is a meaningful concept, whatever the display bandwidth. In fact,
however, the concept applies only as long as the images under study are
recognizable as images. For display bandwidths that are extremely low, the images
become amorphous shapes for which the commonly understood meaning of sharpness
has little significance.* Indeed, at very low bandwidths the distinct
impression of the experiment converts from a judgment of a semantic sharpness
change to one of a semantic contrast change. This transition occurs at about
1 or 2 cycles/degree-of-vision. If we accept the proposition that the spatial-
frequency-specific channels of the visual system are roughly 1 cycle/degree-
of-vision wide [33], then this transition occurs when only one or two channels
are excited. If the energy within each channel is summed before a detection
apparatus is applied, then at the lowest frequency, where only one channel is
excited, the measured change in bandwidth necessary to produce 1 jnd should be
the same for either a scene with a broad spectrum (such as the crowd scene) or
a scene with a narrow spectrum (such as a single sinusoid) at the same average
contrast. This is true even though the latter experiment must involve only a
change in the contrast of the sinusoid. We tested this hypothesis using a
sine-wave grating at 0.14 cycles/degree-of-vision whose general impression
of contrast (22%) was the same as that of the crowd scene with a display
frequency of $\nu_1 = 0.14$ cycles/degree-of-vision. For these conditions the
measured value of $(\nu_2 - \nu_1)/\nu_2$ necessary to produce 1 jnd in contrast
change was 0.6. The comparable value for the crowd scene was 0.7. Thus, at the
lowest spatial frequencies we find our measurements for the jnd in sharpness to
be indistinguishable from the results of an experiment whose objective had been
to measure the jnd in contrast of a single sinusoid.


*This also applies for the edges studied.

F. SUMMARY OF CONCLUSIONS


We  have  obtained  the  first  results  which  relate  the  perception  of  image 
sharpness  to  the  modulation  transfer  function  of  a display.  Our  specific 
findings  are  as  follows: 

(1) The jnd in image sharpness is not determined by a constant change
in the linear size of the display point-spread function. The
change in the point-spread function is a minimum at about 3-5
cycles/degree-of-vision and becomes progressively larger for display
frequencies away from this point.

(2) The scales of subjective sharpness are not appreciably different
for black-and-white and for colored images. From this we
conclude that image sharpness is determined primarily by the luminance
portion of images.

(3) The measured jnd's in sharpness for single-transition luminance
edges and conventional images are indistinguishable. This finding
supports our view that edges are the dominant feature in natural
images.

(4) We have shown that at high frequencies our measured results are
in good agreement with the assumption that a jnd in image
sharpness is determined by a constant change in the perceived rms
gradient content of images [Eq. (7)]. At low spatial frequencies
our results support the contention that the roll-off in the visual
MTF is more gradual for high scene contrasts than it is for scenes
with low contrast. We believe that Eq. (7), for the change in
the display MTF required for a perceivable change in sharpness,
can be applied with confidence to displays with MTF forms different
from that employed in these experiments.

(5)  We  have  shown  that  the  judgment  of  sharpness  changes  is  not  a 
strong  function  of  scene  content.  This  result  is  partially  a 
consequence  of  the  high  degree  of  self-similarity  in  the  spectral 
properties  of  natural  scenes. 

(6) At very low spatial frequencies ($\lesssim 1$ cycle/degree-of-vision), we
have established that the judgment of a sharpness change is
indistinguishable from the judgment of a contrast change for a
single sinusoid.

(7) Finally, we have obtained the scale value necessary to calibrate
the visual capacity. It is now possible to specify, by means of
visual capacity, the absolute perceived sharpness of different
displays.

V. STATISTICAL THEORY OF DISPLAY DESCRIPTORS


A.  INTRODUCTION 

A major goal of our research during the current year was to determine,
from an information theory point of view, the effect of color on perceived
image quality. In our previous formulation (cf. TR1) of the information
capacity of the display-observer system, only the contribution of the perceived
luminance signal was considered. Accordingly, we have sought to extend the
concept of the information capacity so as to encompass the psychophysical
dimensions of luminance and chrominance. Such a total descriptor would not only
allow us to assign a quantitative value to the relative performance
characteristics of monochrome and color displays but would permit display
designers to optimize the allocation of available resources to produce the best
balance of luminance and chrominance signal capability. At the outset the
treatment of such disparate visual dimensions as luminance and chrominance on
an equivalent basis may seem akin to comparing apples and pears. However, it
will be seen that the formalism of information theory, when combined with a
simple, verifiable model of the human visual system, allows an unambiguous
collective treatment of these two dimensions.

In order to proceed with the synthesis of a combined information theory
descriptor for luminance and chrominance, it was necessary to develop a far
less heuristic measure of the luminance channel capacity than the quantity
introduced in TR1. In particular, the assumption that the visual system can be
regarded as a linear system, characterized by a single modulation transfer
function, has been removed. Instead, a nonlinear signal-detection model has
been employed. This model makes use of existing sine-wave contrast-sensitivity
data and, as will be seen, agrees well with the results of other
psychophysical experiments. The essential aspects of the model are discussed
under V.B. below. In V.C. the model is employed to develop the first measure
of the information capacity of the display-observer system that takes into
account the nonlinear characteristics of the human visual system. In V.D.
the information theory approach is applied to the chrominance dimension in a
manner consistent with that utilized for the luminance dimension. The result
is an expression for the total channel capacity of the display-observer system,
including both luminance and chrominance. This expression is then applied to
the solution of a practical design problem involving the optimum trade-off
between luminance and chrominance signal capability.

B. THE DISTRIBUTION OF PERCEIVABLE SINE-WAVE LUMINANCE LEVELS
1.  Signal-Detection  Model 

Our approach to the determination of the channel capacity of the display-
observer system requires a knowledge of the number of perceivable sine-wave
luminance levels as a function of retinal frequency $\nu$. Therefore, before
proceeding with the development of the formalism for the channel capacity, we
shall describe a simple detection model that appears to account for the
observed distribution of these levels.

It is well known that the human visual system can be characterized by
a sine-wave threshold sensitivity function $m_T(\nu)$. This function depends
somewhat on the mean luminance of displayed sine-wave gratings, background
luminance, and display size, but, as was shown in TR2, the function is remarkably
constant from person to person if the viewing conditions are held constant.
Thus, $m_T(\nu)$ can indeed be regarded as a fundamental property of the visual
system.

However, the threshold-sensitivity function is not sufficient to
determine the number of perceivable luminance or contrast levels at a given
retinal frequency. For example, if the visual system were linear, the contrast
or modulation $m_i(\nu)$ required for the perception of $i$ levels would be simply

$$m_i(\nu) = i \, m_T(\nu), \quad i = 1, 2, \ldots \quad \text{(linear model)} \qquad (8)$$

On the other hand, a nonlinear Weber's law model for the distribution of
levels would give an exponential relationship. For a Weber's law model, we
have $\Delta m/m_i = (m_{i+1} - m_i)/m_i = k_w$ (a constant). This difference
equation is easily solved to give

$$m_i(\nu) = (1 + k_w)^{i-1} \, m_T(\nu) \quad \text{(Weber's law model)} \qquad (9)$$

The actual distribution is more complicated than either of the above
models. As may be seen in Fig. 46, the data of Nachmias and Sansbury [31]
show that the required contrast change $\Delta m$ for a jnd in contrast [79.4%
correct response in a forced-choice experiment] is not independent of the value
of the initial contrast, as predicted by the linear model Eq. (8), nor does it
increase linearly with the initial contrast, as required by the Weber's law
model. Indeed, the dependence of $\Delta m$ on starting contrast is not even
monotonic. The required $\Delta m$ first decreases from its initial value $m_T$,
goes through a minimum, and finally does approach a Weber's law characteristic
at high values of the initial contrast.
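The two reference models of Eqs. (8) and (9) can be compared with a short calculation of how many contrast levels each allows between threshold and a maximum contrast. This is a minimal sketch: the function names, the maximum contrast of 1, and the numerical values of the threshold contrast and Weber fraction are assumptions chosen for illustration, not values from the report.

```python
import math

def levels_linear(m_threshold, m_max=1.0):
    """Number of levels i with i * m_T <= m_max, from Eq. (8)."""
    return int(m_max / m_threshold)

def levels_weber(m_threshold, k_w, m_max=1.0):
    """Number of levels i with (1 + k_w)**(i - 1) * m_T <= m_max, from Eq. (9)."""
    return 1 + int(math.log(m_max / m_threshold) / math.log(1.0 + k_w))

def weber_contrast(i, m_threshold, k_w):
    """Contrast of the i-th level under the Weber's law model, Eq. (9)."""
    return (1.0 + k_w) ** (i - 1) * m_threshold

m_T = 0.003   # hypothetical threshold contrast
k_w = 0.1     # hypothetical Weber fraction
print(levels_linear(m_T), levels_weber(m_T, k_w))
```

For the same threshold, the linear model yields far more levels than the Weber's law model, because its step size never grows with contrast.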


[Figure 46 plot. Axis label: contrast.]

Figure 46. Required contrast increase for a jnd in contrast for 3-cycle/
degree-of-vision gratings as a function of the starting
contrast value. The data points are taken from the
experiments of Nachmias and Sansbury [31]. Different symbols
represent different observers. The solid curve is the
theoretical fit to the experimental points, based on the
indicated values of the threshold contrast $m_T$ and the
constant fraction $k$.


The observed behavior can be accounted for by a simple square-law detection
model. Such a model has been discussed in the psychophysics literature [31,34]
but, to the best of our knowledge, has not been applied to an analysis of the
results of the measurements of the jnd in contrast. We assume that the relevant
psychophysical stimulus in the discrimination experiments is the difference
$\Delta L^2$ in the mean-square luminance. If the initial contrast is $m_0$, the
expression for $\Delta L^2$ is

$$\Delta L^2 = \tfrac{1}{2}(m_0 + \Delta m)^2 L_0^2 - \tfrac{1}{2}\,m_0^2 L_0^2 \tag{10}$$

where $L_0$ is the average luminance of the displayed gratings. The crucial
statement of the model is as follows: In order for a contrast difference to
be perceived with a given probability, the change in mean-square luminance
must be equal to a constant fraction of the interfering signal. In general,
the interfering signal consists of the sum of contributions from visual noise,
random display noise, and the initial value of the mean-square luminance.
Mathematically, this statement may be written in the form

$$\Delta L^2 = k\left[N_v(\nu)\,\Delta\nu + N(2\pi\nu/r)\,2\pi\Delta\nu/r + \tfrac{1}{2}\,m_0^2 L_0^2\right] \tag{11}$$

where $k$ is a constant fraction, $N_v(\nu)$ is the visual-noise spectral power
per unit retinal frequency, $N(\omega)$ is the display-noise power spectrum as a
function of the angular frequency on the display screen $\omega = 2\pi\nu/r$
($r$ is the viewing distance), and $\Delta\nu$ is the width of the appropriate
spatial-frequency-specific channel of the visual system [34].

For the contrast discrimination experiments of Nachmias and Sansbury [31],
we may take $N(\omega) = 0$. Then, combining Eqs. (10) and (11) and solving for
$\Delta m$, we easily obtain

$$\Delta m = -m_0 + \left[(1 + k)\,m_0^2 + 2kN_v\,\Delta\nu/L_0^2\right]^{1/2} \tag{12}$$

Since, in the limit $m_0 \to 0$, $\Delta m = m_T$, we can make the identification

$$m_T^2(\nu) = 2kN_v(\nu)\,\Delta\nu/L_0^2 \tag{13}$$


Thus, Eq. (12) becomes

$$\Delta m = -m_0 + \left[(1 + k)\,m_0^2 + m_T^2\right]^{1/2} \tag{14}$$

For the case of small initial contrast, $m_0^2 \ll m_T^2$, Eq. (14) reduces to
$\Delta m = m_T - m_0$, thereby predicting an initial increase in observer
discriminability as $m_0$ is increased from zero. On the other hand, for the
case of large initial contrast, $m_0^2 \gg m_T^2$, Eq. (14) becomes
$\Delta m = [(1 + k)^{1/2} - 1]\,m_0$, indicating Weber's law behavior. The solid
curve in Fig. 46 was generated from Eq. (14) by use of the value
$m_T = 3.5 \times 10^{-3}$ for the threshold-contrast sensitivity and $k = 0.15$.
It is seen that the theoretical curve is in very good agreement with the
experimental results. It should be pointed out that an analysis based on a
linear detection model is incapable of reproducing the observed initial
decrease of $\Delta m$ and the consequent minimum at $m_0 \approx 8 \times 10^{-3}$.
On the other hand, models based on powers higher than the square-law assumed
here may be able to account for the experimental observations, but in view of
the success of the square-law hypothesis in explaining the observed
$\Delta m(m_0)$, consideration of these more complicated models seems unwarranted.
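The limiting behavior of Eq. (14) can be checked numerically. The sketch below
is a minimal illustration using the values $k = 0.15$ and $m_T = 3.5 \times 10^{-3}$
quoted for Fig. 46; the function names are hypothetical, and the expression for
the position of the minimum is obtained here by setting the derivative of
Eq. (14) with respect to $m_0$ to zero, a step the text does not spell out.

```python
import math

K = 0.15       # constant fraction k quoted in the text
M_T = 3.5e-3   # threshold contrast m_T quoted for the Fig. 46 fit


def delta_m(m0, m_t=M_T, k=K):
    """Eq. (14): contrast increment required for a jnd at initial contrast m0."""
    return -m0 + math.sqrt((1.0 + k) * m0 ** 2 + m_t ** 2)


def m0_at_minimum(m_t=M_T, k=K):
    """Initial contrast minimizing Eq. (14): m0 = m_T / sqrt(k * (1 + k))."""
    return m_t / math.sqrt(k * (1.0 + k))


def weber_fraction(k=K):
    """High-contrast limit of delta_m / m0, i.e. (1 + k)**0.5 - 1."""
    return math.sqrt(1.0 + k) - 1.0


if __name__ == "__main__":
    print("delta_m at m0 = 0:", delta_m(0.0))
    print("m0 at minimum:", m0_at_minimum())
    print("delta_m at minimum:", delta_m(m0_at_minimum()))
    print("Weber fraction:", weber_fraction())
```

With these inputs the minimum falls at $m_0 \approx 8.4 \times 10^{-3}$, consistent
with the value of about $8 \times 10^{-3}$ read from the figure, and the
high-contrast Weber fraction is about 0.0724.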

Given the form of Eq. (7) for the required change $\Delta m$ for a jnd in
contrast, one can obtain the modulation level necessary for the perception of
$i$ luminance levels at a given retinal frequency. First, Eq. (14) is written
as a difference equation. Setting $m_0 = m_i$ and $\Delta m = m_{i+1} - m_i$, we have

$$m_{i+1} = \left[(1 + k)\,m_i^2 + m_T^2\right]^{1/2} \tag{15}$$

Equation (15) is easily solved to give

$$m_i^2 = (m_T^2/k)\left[(1 + k)^i - 1\right], \quad i = 1, 2, \ldots \tag{16}$$

Figure 47 compares the calculated distribution of levels obtained from Eq. (16),
for $k = 0.15$, with the predictions of the linear model, Eq. (8), and the
Weber's law model, Eq. (9), with $k_w = [(1 + k)^{1/2} - 1] = 0.0724$. One can see
that there are considerable differences in the predictions of the various
models. Because of the initial increase in discriminability for small contrast
values, the actual distribution places more perceivable levels at low contrast
values (below about $40\,m_T$) than does the linear model. For contrast values
above approximately $40\,m_T$, the increase of the required level spacing
$\Delta m$ at high contrast values dominates, so that the actual distribution
requires a larger modulation depth to achieve a given number of perceivable
levels than does the linear model. On the other hand, as can be seen from the
figure, an extrapolation of the Weber's law regime ($\Delta m = k_w m_i$) of the
actual distribution to small contrast values would consistently underestimate
the modulation required to achieve a given number of perceivable levels. Thus,
one cannot ignore the effect of the enhanced discriminability at low contrast
levels in calculating the number of perceivable levels.

Figure 47. Computed distribution of perceived luminance levels for
           three model distributions, plotted against level number $i$.
           Eq. (16), with $k = 0.15$, was used to compute the actual
           distribution. Eq. (9), with $k_w = [(1 + k)^{1/2} - 1] = 0.0724$,
           was used to compute the Weber's law model.
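The three level distributions compared in Fig. 47 can be tabulated with a few
lines of Python. This is only a sketch: the function names are hypothetical,
and levels are expressed in units of the threshold contrast when locating the
point where the square-law distribution of Eq. (16) overtakes the linear model.

```python
import math


def linear_level(i, m_t):
    """Eq. (8): m_i = i * m_T."""
    return i * m_t


def weber_level(i, m_t, k):
    """Eq. (9) with k_w = (1 + k)**0.5 - 1, the high-contrast Weber fraction."""
    k_w = math.sqrt(1.0 + k) - 1.0
    return (1.0 + k_w) ** (i - 1) * m_t


def square_law_level(i, m_t, k):
    """Eq. (16): m_i = m_T * sqrt(((1 + k)**i - 1) / k)."""
    return m_t * math.sqrt(((1.0 + k) ** i - 1.0) / k)


def crossover_index(k, max_i=500):
    """Smallest i > 1 at which the square-law level exceeds the linear one."""
    for i in range(2, max_i + 1):
        if square_law_level(i, 1.0, k) > linear_level(i, 1.0):
            return i
    return None


if __name__ == "__main__":
    k, m_t = 0.15, 3.5e-3
    for i in (1, 2, 5, 10, 20, 40, 60):
        print(i, linear_level(i, m_t), square_law_level(i, m_t, k),
              weber_level(i, m_t, k))
    print("crossover level number:", crossover_index(k))
```

For $k = 0.15$ the crossover occurs at level 39, that is, at a contrast of
roughly $39\,m_T$, in line with the "about $40\,m_T$" quoted above. The Weber's
law extrapolation stays below the square-law levels for every $i > 1$.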

The result, presented in Eq. (16), may be employed to compute the total
number of contrast levels that can be perceived, at a given retinal frequency,
without exceeding a specified contrast value. Only the threshold-contrast-
sensitivity function $m_T(\nu)$ and the parameter $k$ are required to perform the
computation. In the following, we shall employ the result for $m_T(\nu)$, presented
on p. 95 of TR2, for a 6.5° display* with an average luminance of 34 mL and
surround brightness of 3.4 mL. These viewing conditions were selected as
typical of a wide variety of display situations. The value $k = 0.15$ will be
used and assumed to be independent of frequency. Although data for
$\Delta m(m_0)$ were presented for only one retinal frequency (3 cycles/degree-of-
vision), it is reasonable to assume that $k$ is slowly varying with frequency.
This assumption implies that each frequency-specific channel of the visual
system is endowed with a detector of equal sensitivity.

As a result of assuming a frequency-independent $k$, the computed contrast
sensitivity $1/\Delta m$ as a function of $\nu$ should become relatively independent
of frequency as the initial contrast $m_0$ is increased. This effect is shown
in Fig. 48. As $m_0$ is increased, the low-frequency roll-off of the threshold-
sensitivity curve rapidly disappears, and the curves develop a wide-band region
of almost constant sensitivity. (Naturally, at $\nu = 0$, the contrast sensitivity
must always vanish.) The computed flattening of the sensitivity curves, and
hence the assumption of a frequency-independent $k$, is consistent with the
results of the suprathreshold contrast-matching experiments of Georgeson and
Sullivan [32]. These authors observed that the relative perceived contrast of
pairs of gratings of different retinal frequencies becomes relatively
independent of frequency as the grating contrast is increased.

2.  Number  of  Perceivable  Contrast  Levels 

To compute the maximum number of perceivable contrast levels as a function
of $\nu$, we set $m_i = 1$ and $i = n(\nu) - 1$ in Eq. (16). Solving for $n(\nu)$,
we have

$$n(\nu) = 1 + \frac{\ln\left[1 + k/m_T^2(\nu)\right]}{\ln(1 + k)} \tag{17}$$

where $k = 0.15$ (100% modulation).
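Equation (17) is simple enough to evaluate directly. The following sketch
(function names hypothetical) computes $n(\nu)$ for a given threshold contrast
and also inverts the relation, which shows what threshold would be needed to
reach a given number of levels. The value $m_T = 3.5 \times 10^{-3}$ used in the
demonstration is the Fig. 46 fit value and is not necessarily the TR2 threshold
data used for the report's own curves.

```python
import math


def max_levels_full_modulation(m_t, k=0.15):
    """Eq. (17): number of perceivable levels for 100% modulation."""
    return 1.0 + math.log(1.0 + k / m_t ** 2) / math.log(1.0 + k)


def threshold_for_levels(n, k=0.15):
    """Invert Eq. (17): threshold contrast m_T that yields n levels."""
    return math.sqrt(k / ((1.0 + k) ** (n - 1.0) - 1.0))


if __name__ == "__main__":
    n = max_levels_full_modulation(3.5e-3)
    print("levels:", n, "bits:", math.log2(n))
    print("m_T giving 76 levels:", threshold_for_levels(76.0))
```

Because $n$ depends only logarithmically on $m_T$, a factor-of-two change in the
threshold contrast shifts the level count by only about ten levels for
$k = 0.15$.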

*Rigorously, $m_T(\nu)$ is independent of display size. We have found (TR2, p. 95)
that, for display sizes subtending more than about 6° of viewing angle, the
measured $m_T(\nu)$ varies slowly with display size over the frequency range of
interest.
Figure 48. Computed contrast sensitivity $1/\Delta m$ as a function of retinal
           frequency $\nu$ (cycles/degree-of-vision) for various starting
           contrast levels $m_0$. The quantity $\Delta m$ is the required
           contrast increase for a jnd in contrast.

Equation (17) and the measured $m_T(\nu)$ were used to compute the $n(\nu)$
indicated by the upper curve in Fig. 49. The maximum value of $n(\nu)$ is
approximately 76 and is achieved at the frequency $\nu \approx 3$ cycles/degree-of-
vision, corresponding to the peak of the threshold-contrast-sensitivity curve.
This result tells us that, in principle, we need not employ more than about six
bits of picture information* for the assumed viewing conditions (34-mL display
luminance, 3.4-mL surround luminance), provided the levels are spaced in accord
with Eq. (16).

*It should be kept in mind that the analysis assumes a constant 79.4% correct
response in the forced-choice contrast-discrimination experiments. If one
were to impose a more rigorous constraint on the detectability of the
contrast levels, computed values of $n(\nu)$ would naturally be larger.
Figure 49. Computed number of perceivable luminance levels as a function
           of retinal frequency $\nu$ for gratings of 100% modulation and
           for the luminance power spectrum of natural scenes at the
           viewing distance $r/w = 3$. The corresponding display
           frequencies (TV-lines) at $r/w = 3$ are indicated at the top
           of the figure.

Indeed, as will be shown below, the required number of contrast levels for the
statistically averaged picture content is significantly less than even this
modest figure.

Before computing the channel capacity of the display-observer system, we
need to derive an expression for the number of perceivable contrast levels as
a function of $\nu$ for the case where the value of the contrast at any frequency
is established by the statistically averaged signal power spectrum. Real
scenes do not consist of 100%-modulated sine waves at all frequencies, so that
Eq. (17) should be viewed only as providing an upper limit to the number of
perceivable contrast levels. A consistent statistical communication theory
approach to the calculation of the channel capacity requires the number of
levels that can be transmitted through a communication channel to be computed
by use of the power spectrum corresponding to the ensemble of possible inputs
to the channel. The power spectrum has been the subject of considerable study
during the course of this research, first in TR1 and, in more detail, in
Section III of this report. We have found that, for display frequencies greater
than about 1 cycle/picture width, the modulated contribution to the luminance
power spectrum for natural scenes, $\Phi_L(\omega)$, can be represented by an
inverse-square-frequency characteristic*:

$$\Phi_L(\omega) = 2\omega_L\left[\overline{I^2} - (\bar{I})^2\right]/\omega^2 \tag{18}$$


Here $\omega_L = 2\pi/w$ ($w$ is the picture width) is the lower cutoff frequency,
and $\overline{I^2}$ and $\bar{I}$ are the mean-square and mean luminance,
respectively. In order to change the power spectrum represented by Eq. (18)
into an equivalent sine-wave contrast, we appeal to the concept of frequency-
specific independent channels [33,34] in the human visual system. We imagine
that each channel is centered about a retinal frequency $\nu = \omega r/2\pi$ and
has a width $\Delta\nu$. In general, the power spectrum $\Phi_L(\omega)$ encompasses
many of these channels, so that the total power in a single channel is
approximately $(1/\pi) \times (2\pi\Delta\nu/r) \times \Phi_L(\omega)$.** This single-
channel power can be regarded, for the purposes of detection, as equivalent
to that of a single sine wave of frequency $\omega$ whose spectral width is much
less than the channel width. In other words, the independent-channel model
implies that the detector for a particular channel cannot distinguish between
the input for a single sine wave whose mean-square luminance is
$\tfrac{1}{2}\,m_{eq}^2(\omega)\,(\bar{I})^2$ and a continuum of frequencies with the
same value of mean-square luminance contained within the channel. Thus, the
power spectrum given in Eq. (18) is equivalent to a sine-wave contrast
$m_{eq}(\omega)$ where***

$$\tfrac{1}{2}\,m_{eq}^2(\omega)\,(\bar{I})^2 = (1/\pi)\,(2\pi\Delta\nu/r)\,2\omega_L\left[\overline{I^2} - (\bar{I})^2\right]/\omega^2 \tag{19}$$


*The form of $\Phi_L(\omega)$ given in Eq. (18) satisfies the condition
$\int_{-\infty}^{\infty} (d\omega/2\pi)\,\Phi_L(\omega) = \overline{I^2} - (\bar{I})^2$
if the low-frequency behavior of $\Phi_L(\omega)$ is represented by a Lorentzian
(cf. TR2, p. 46).

**We employ the Fourier representation in which the spectral power per unit of
(positive) frequency is $(1/\pi)\,\Phi_L(\omega)$.

***To be precise, Eq. (19) may break down at frequencies sufficiently low for
the channel width $\Delta\nu$ to become of the order of the channel frequency. In
practice this occurs when $\nu \lesssim 1$ cycle/degree-of-vision. In that case,
the specific form of the response of each channel may be important; the
integral of the power spectrum over a single channel may therefore not be
approximated by simply multiplying the power spectrum evaluated at the center
frequency of the channel by the channel width, as assumed in Eq. (19).
Nevertheless, for the sake of simplicity, we shall apply Eq. (19) at all
retinal frequencies greater than that corresponding to the lower cutoff
frequency $\omega_L$.
Rearranging Eq. (19) and changing the display frequency $\omega$ to the retinal
frequency coordinate $\nu$, we can express $m_{eq}^2$ as

$$m_{eq}^2(\nu) = (4/\pi)\,(r/w)\,\Delta\nu\left[\overline{I^2}/(\bar{I})^2 - 1\right]/\nu^2 \tag{20}$$

The number of perceivable contrast levels corresponding to the luminance power
spectrum for natural scenes is now easily obtained from Eq. (16) by replacing
$m_i$ by $m_{eq}$ and setting $i = n(\nu) - 1$. We also take into account the
modulation transfer characteristics of the display by multiplying the power
spectrum, Eq. (18), by $|R_{eff}(\omega)|^2$, where $R_{eff}$ is the effective overall
MTF of the display. We thus find, for $n(\nu)$,

$$n(\nu) = 1 + \frac{\ln\left\{1 + (4k/\pi)\,(r/w)\,\Delta\nu\left[\overline{I^2}/(\bar{I})^2 - 1\right]|R_{eff}(2\pi\nu/r)|^2 / \left[\nu^2 m_T^2(\nu)\right]\right\}}{\ln(1 + k)} \tag{21}$$

where $k = 0.15$ (power spectrum of natural scenes).

The lower curve in Fig. 49 represents the $n(\nu)$ computed from Eq. (21)
for a viewing distance of three picture widths. In the calculation, we have
assumed a perfect display ($R_{eff} = 1$); the channel width was taken to be 1
cycle/degree-of-vision, the value indicated by recent psychophysical
measurements [33]. The value of the quantity $[\overline{I^2}/(\bar{I})^2 - 1]$ was
set at 1/6, representative of the results of the statistical measurements
described in Section III. A comparison of the two curves in Fig. 49 shows that
the effect of the $1/\omega^2$ power spectrum is to reduce and flatten the curve of
$n(\nu)$. The reduction is simply the result of the fact that the statistically
averaged contrast level [Eq. (20)] is significantly less than 100%. The
flattening is due to the exact compensation of the low-frequency linear
roll-off of the threshold-contrast sensitivity [cf. Section V of TR2] by the
inverse-square-frequency dependence of the power spectrum.
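The reduction and flattening described above can be illustrated with a short
Python sketch of Eqs. (20) and (21). Two points are assumptions made here and
not statements from the report: frequencies are converted to cycles per radian
so that the equivalent contrast is dimensionless, and the low-frequency
threshold is modeled by a hypothetical $m_T \propto 1/\nu$ law standing in for the
measured TR2 data.

```python
import math

# Assumption: frequencies enter Eq. (20) in cycles per radian, so that
# m_eq^2 is dimensionless; 1 cycle/degree = 180/pi cycles/radian.
CPD_TO_CYCLES_PER_RADIAN = 180.0 / math.pi


def m_eq_squared(nu_cpd, r_over_w, delta_nu_cpd=1.0, variance_ratio=1.0 / 6.0):
    """Eq. (20): squared equivalent sine-wave contrast of an average scene."""
    nu = nu_cpd * CPD_TO_CYCLES_PER_RADIAN
    delta_nu = delta_nu_cpd * CPD_TO_CYCLES_PER_RADIAN
    return (4.0 / math.pi) * r_over_w * delta_nu * variance_ratio / nu ** 2


def levels_natural_scene(nu_cpd, m_t, r_over_w, k=0.15, mtf=1.0):
    """Eq. (21): perceivable levels for the natural-scene power spectrum."""
    x = k * m_eq_squared(nu_cpd, r_over_w) * mtf ** 2 / m_t ** 2
    return 1.0 + math.log(1.0 + x) / math.log(1.0 + k)


def hypothetical_low_frequency_threshold(nu_cpd, c=0.006):
    """Illustrative m_T proportional to 1/nu (linear sensitivity roll-off)."""
    return c / nu_cpd


if __name__ == "__main__":
    for nu in (0.25, 0.5, 1.0):
        m_t = hypothetical_low_frequency_threshold(nu)
        print(nu, levels_natural_scene(nu, m_t, r_over_w=3.0))
```

With a threshold that falls as $1/\nu$, the ratio $m_{eq}^2/m_T^2$ no longer depends
on $\nu$, so the printed level count is the same at every frequency. This is
the flattening mechanism described in the text.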

Figure 49 indicates that, for an average scene, little more than five
bits of picture information are required, and that, at the viewing distance
considered, negligible picture information is needed beyond a display
frequency of 400 TV-lines.* The effect of viewing distance is shown in Fig. 50,
where $\log_2 n(\omega)$ is plotted for $r/w = 1, 2, 3, 4$. One sees that the maximum
number of bits remains approximately constant at about five bits. However,
as one would expect, the display bandwidth requirements depend sensitively on
the viewing distance, smaller bandwidths being required as the viewing
distance increases.

*Display frequency in TV-lines is defined here as the total number of half-
cycles on the display screen: $N_{TV} = \omega w/\pi$.

Figure 50. Computed base-2 logarithm of the number of perceivable
           luminance levels as a function of display frequency
           (TV-lines) for various viewing distances.




C.  THE  LUMINANCE  CHANNEL  CAPACITY 


1.  Formalism 


In information theory, the channel capacity is defined as the maximum
number of units of binary information that can be transmitted through a
communication channel. Numerous treatments of linear communications systems
exist [35], but, as was seen in V.B.1 above, human perception of contrast
levels at a given retinal frequency is extremely nonlinear. In what follows,
we shall take into account this nonlinear distribution by employing the
result [Eq. (21)] for the number of perceivable contrast levels as a function
of retinal frequency in deriving an expression for the channel capacity for
luminance signals.

Before proceeding, we should like to reiterate a point made previously
(in Section II.A of TR2) regarding the meaning of an information theory
descriptor such as the channel capacity. Since the channel capacity is
concerned only with the transmission of structural information, there is no
formal explicit constraint that would require the transmitted picture to be
perceived as a faithful rendition of the original. The channel capacity should
be viewed as a measure of the ability of the channel, consisting of the
display and the observer, to transmit structural information, measured in bits,
from the input scene to the cognitive level of the brain. In practice, one
usually expects to find a one-to-one relationship between the channel capacity
and a measure of the faithfulness of reproduction such as the correlation
fidelity (cf. Section II of TR2). However, the distinction between the
concepts of faithfulness of reproduction and information transmission should be
kept in mind.

In order to compute the channel capacity, we need an expression for the
total number of distinguishable states for the display-observer system. We
consider only the one-dimensional case here: the extension to two dimensions
by means of the method of Section IV of TR2 is trivial. For a given display
frequency $\omega$, there are $n(\omega)$ perceivable contrast levels. We choose a set
of quantized display frequencies in a manner that will allow only integer
multiples of the fundamental angular frequency $\omega_L = 2\pi/w$; i.e.,
$\omega_j = j\omega_L$ ($j$ is an integer). Thus, in the frequency interval between
$\omega_j$ and $\omega_j + \Delta\omega$, where $\Delta\omega = 2\pi\Delta j/w$, there are
$[n(\omega_j)]^{\Delta j}$ distinguishable states of the display-observer system,
provided $n(\omega)$ is slowly varying over the range
$\omega_j \le \omega \le \omega_{j+\Delta j}$. For a system of independent frequencies,
the total number of distinguishable states then is

$$W_L = \prod_j \left[n(\omega_j)\right]^{(w\Delta\omega/2\pi)} \tag{22}$$

The luminance channel capacity $H_L$ is the base-two logarithm of $W_L$ [36,37]:

$$H_L = \sum_j (w\Delta\omega/2\pi)\,\log_2 n(\omega_j) \tag{23}$$

When the important display frequencies are much larger than $\omega_L$, the sum in
Eq. (23) may be replaced by an integral. Proceeding to the limit of an
infinitesimal $\Delta\omega$, Eq. (23) becomes

$$H_L = (w/2\pi)\int d\omega\,\log_2 n(\omega) \tag{24}$$

Equation (21) for the number of perceivable levels may be employed in
Eq. (24) for the case of noiseless displays, $N(\omega) = 0$. However, we wish to
include the effect of display noise on the luminance channel capacity. This
task is easily accomplished by referring to Eq. (11), where it is seen that
display noise is added to the visual-noise power spectrum in a manner that
replaces the visual noise $N_v(\nu)\,\Delta\nu$ by the quantity
$[N_v(\nu)\,\Delta\nu + N(2\pi\nu/r)\,2\pi\Delta\nu/r]$. Then, employing Eq. (13), which
links $N_v(\nu)$ to the observable quantity $m_T(\nu)$, we find that Eq. (21) must
be modified so that $m_T^2(\nu)$ is replaced by the quantity
$[m_T^2(\nu) + 2k\,(2\pi\Delta\nu/r)\,N(2\pi\nu/r)\,|R_{eff}(2\pi\nu/r)|^2/(\bar{I})^2]$.
Thus, expressed in terms of the display frequency coordinate $\omega$, Eq. (21)
for $n(\omega)$ becomes

$$n(\omega) = 1 + \frac{1}{\ln(1 + k)}\,\ln\left\{1 + \frac{(4k/\pi)\,(\omega_L\,2\pi\Delta\nu/r)\left[\overline{I^2}/(\bar{I})^2 - 1\right]|R_{eff}(\omega)|^2}{\omega^2\left[m_T^2(\omega r/2\pi) + 2k\,(2\pi\Delta\nu/r)\,N(\omega)\,|R_{eff}(\omega)|^2/(\bar{I})^2\right]}\right\} \tag{25}$$

where $\omega_L = 2\pi/w$ and $k = 0.15$.


2.  Properties  of  the  Luminance  Channel  Capacity 

Equations (24) and (25) constitute our result for the luminance channel
capacity $H_L$ of the display-observer system. The quantity $H_L$ is a function
of the viewing distance and depends on the display modulation transfer function
$R_{eff}(\omega)$, the display signal-to-noise ratio (through the noise power
spectrum $N(\omega)$), and the statistics of the input scenes (through the
quantities $\omega_L$ and $[\overline{I^2}/(\bar{I})^2 - 1]$). As a function of viewing
distance, $H_L$ exhibits the same general characteristics as the visual
capacity [38]; it rises from zero, achieves a broad maximum at a
display-dependent viewing distance, and falls off at larger viewing distances.

We have considered the simple case of a noiseless television system with an
ideal, flat passband [$R_{eff}(\omega) = 1$ for $|\omega| \le \omega_{max}$ and $= 0$ for
$|\omega| > \omega_{max}$]. We find* that, for a passband corresponding to
$N_{max} = \omega_{max} w/\pi = 300$ TV-lines, the computed maximum of $H_L$ is
approximately 1400 bits and occurs at a viewing distance of about two picture
heights (3/2 picture widths). This value of $H_L$ corresponds to an average of
nearly five bits over the allowed passband, as one would expect in view of the
results shown above (in V.B.2; cf. Fig. 50). The value of the viewing distance
for maximum $H_L$ is smaller by a factor of 2 than the value computed for
maximum visual capacity [38]. The two descriptors represent two entirely
different perceptual quantities, sharpness and total information, so that the
viewing distance that optimizes one of these quantities need not optimize the
other.

The dependence of $H_L$ on display bandwidth at a fixed viewing distance
$r/w = 3$ is shown in the uppermost curve of Fig. 51. Once again, for the sake
of simplicity, we have assumed an ideal low-pass filter for $R_{eff}(\omega)$. As the
bandwidth is increased, $H_L$ first rises linearly with bandwidth but eventually
bends over and approaches an asymptotic value. The computed asymptotic value
of $H_L$ at this viewing distance is approximately 1590 bits. As can be seen
from Fig. 51, an increase in the display bandwidth beyond about 400 TV-lines
has a negligible effect on $H_L$.

*As in the computations of Section V.B, we employed the measured $m_T(\nu)$ for
34-mL display brightness and 3.4-mL surround brightness. The channel width
$\Delta\nu$ was taken to be 1 cycle/degree-of-vision, and the quantity
$[\overline{I^2}/(\bar{I})^2 - 1]$ was set equal to 1/6.

Figure 51. Computed channel capacities of the display-observer
           system as a function of display bandwidth at the viewing
           distance $r/w = 3$. Curves are shown for the one luminance
           and two chrominance channels, Red/Blue-Green and Yellow/Blue.

The effect of display noise on $H_L$ can be computed by means of Eq. (25)
for $n(\omega)$. We treat the case of a white-noise spectrum,

$$N(\omega)\,|R_{eff}(\omega)|^2 = \begin{cases} \pi\,\overline{N^2}/\omega_{max} & \text{for } |\omega| \le \omega_{max} \\ 0 & \text{for } |\omega| > \omega_{max} \end{cases} \tag{26}$$

where $\overline{N^2}$ is the mean-square noise-luminance fluctuation, as measured
on the display screen, and $\omega_{max}$ is the maximum display frequency.
Substituting Eq. (26) into Eq. (25), we see that display noise becomes an
important factor when the following approximate equality holds:

$$m_T^2(\omega r/2\pi) = \frac{2\pi k\,\Delta\nu\,\overline{N^2}}{(\omega_{max} r/2\pi)\,(\bar{I})^2} \tag{27}$$

Next, we define the display signal-to-noise ratio S/N as the ratio of the mean
luminance $\bar{I}$ to the rms noise fluctuation $(\overline{N^2})^{1/2}$. Then, taking
$k = 0.15$ and $\Delta\nu = 1$ cycle/degree-of-vision, Eq. (27) becomes

$$m_T^2(\omega r/2\pi) = 54\left[(\omega_{max} r/2\pi)\,(S/N)^2\right]^{-1} \tag{28}$$

This equation predicts that, for the example of a 300 TV-line display at a
viewing distance of three picture widths, display noise becomes important when
$S/N = 0.35\,m_T^{-1}(\nu)$. Now, the experimental values of $m_T^{-1}(\nu)$ average
about 1/0.004 = 250 over the allowed frequency range. Thus, we expect the effect
of display noise to become significant when $S/N \lesssim 90$, a value in agreement
with practical display experience.
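The arithmetic behind Eq. (28) and the S/N estimate can be reproduced as
follows. This sketch assumes that the channel width of 1 cycle/degree is
converted to cycles per radian (a factor of $180/\pi$) before it is inserted in
Eq. (27); that conversion is inferred here from the numerical factor 54 and is
not stated explicitly in the text. The function names are hypothetical.

```python
import math


def noise_coefficient(k=0.15, delta_nu_cpd=1.0):
    """2*pi*k*delta_nu, with delta_nu converted from cycles/degree to cycles/radian."""
    return 2.0 * math.pi * k * delta_nu_cpd * (180.0 / math.pi)


def critical_snr(m_t, tv_lines, r_over_w, k=0.15, delta_nu_cpd=1.0):
    """Solve Eq. (28) for the S/N at which display noise becomes important."""
    # omega_max * r / (2*pi) in cycles/radian, using N_max = omega_max * w / pi
    nu_max = 0.5 * tv_lines * r_over_w
    return math.sqrt(noise_coefficient(k, delta_nu_cpd) / nu_max) / m_t


if __name__ == "__main__":
    print("coefficient:", noise_coefficient())
    print("S/N in units of 1/m_T:", critical_snr(1.0, 300.0, 3.0))
    print("S/N for m_T = 0.004:", critical_snr(0.004, 300.0, 3.0))
```

For a 300 TV-line display at $r/w = 3$ this gives $S/N \approx 0.35/m_T$, or about
87 for $m_T = 0.004$, which rounds to the value of 90 quoted above.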
Figure 52. Computed luminance channel capacity $H_L$ as a function of
           display signal-to-noise (S/N) ratio at the viewing distance
           $r/w = 3$. The display passband is represented by an ideal
           low-pass filter with a bandwidth of 300 TV-lines. The
           power spectrum of the noise was assumed to be white. The
           value of $H_L$ for S/N = 50 is indicated in the figure.

The result of a detailed calculation of $H_L$ as a function of S/N for our
hypothetical 300 TV-line display, viewed at $r/w = 3$, is shown in Fig. 52. It
is seen that the computed $H_L$ is reduced by only 10% from its noisefree value
when S/N = 100. However, for values of S/N below about 50, $H_L$ falls off
rapidly. This result is illustrated in more practical terms in Fig. 53. There,
we have plotted the equivalent noisefree bandwidth $N_{eq}$ against S/N. The
quantity $N_{eq}$ represents the bandwidth necessary for a noisefree display to
match the luminance channel capacity of a 300 TV-line display with the
appropriate value of S/N. For example, a 300 TV-line display with S/N = 100 is
equivalent, in luminance channel capacity, to a 256 TV-line noisefree display.
Using the results of Section IV, we estimate that the 256 TV-line display is,
perceptually, less than 2 jnd's from the 300 TV-line display. Thus, a display
signal-to-noise ratio of 100 is very good indeed in this case. However, for
values of S/N much below 100, $N_{eq}$ decreases rapidly. At S/N = 50, the
equivalent bandwidth is below 200 TV-lines, as a result of which the display
has been seriously degraded.

Figure 53. Equivalent noisefree bandwidth $N_{eq}$ as a function of display
           signal-to-noise (S/N) ratio at the viewing distance $r/w = 3$.
           The display passband is represented by an ideal low-pass
           filter with bandwidth of $N_{eq}$. The power spectrum of the
           noise was assumed to be white. The value $N_{eq} = 300$ is
           indicated in the figure for S/N = $\infty$.

It should be borne in mind that the numerical values of $H_L$ are derived
from a specific input power spectrum: that for natural pictorial scenes.
Naturally, the values of $H_L$ and conclusions derived from them may change
markedly if other power spectra are employed. For example, the input from
alphanumeric displays with random character positions may be characterized by
a white-noise power spectrum for frequencies below roughly the inverse of the
characteristic width of an alphanumeric character, a form entirely different
from that for natural scenes [Eq. (18)]. For such an ensemble of input scenes,
Eq. (25) for $n(\omega)$ must be modified by replacing the power spectrum term
$2\omega_L[\overline{I^2} - (\bar{I})^2]/\omega^2$ by the appropriate white-noise power
spectrum.

Finally, it should be noted that $H_L$ is, in general, a luminance-dependent
quantity. For displays operating above about 10 mL, the dependence of $H_L$ on
luminance is negligible. For smaller luminance values, it is possible to
include the effect of luminance on $H_L$ in a simple way by letting the
threshold-contrast-sensitivity function $m_T(\nu)$ vary with luminance. We assume
that the visual noise $N_v(\nu)$ consists of the sum of an internal-noise term
$N_{vi}(\nu)$, a property of the visual system, and a shot-noise term $N_{vs}(\nu)$
that arises from the finite number of photons arriving at the retina. These
contributions can be separated by examining the experimental $m_T(\nu)$ as a
function of luminance. From fundamental statistical considerations, we expect
$N_{vi}(\nu)$ to be proportional to the square of the average luminance, whereas
$N_{vs}(\nu)$ should be proportional to the average luminance itself. Then,
Eq. (13) for $m_T^2(\nu)$ can be written in the form

$$m_T^2(\nu) = A(\nu) + B(\nu)/\bar{I} \tag{29}$$

where $A(\nu)$ and $B(\nu)$ are independent of luminance. The functions $A(\nu)$ and
$B(\nu)$ can be determined by an analysis of existing $m_T(\nu)$ data at various
values of the mean luminance. We have verified the form of Eq. (29) using the
data presented in Fig. 33 of TR1. As expected, the contribution of the last
term in Eq. (29), due to photon shot noise, was found to be negligible for
luminance values above about 10 mL. Thus, the function $A(\nu)$ represents the
$m_T^2(\nu)$ employed in the numerical calculations presented here.
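One way to extract $A(\nu)$ and $B(\nu)$ from threshold data at a single retinal
frequency is an ordinary least-squares fit of $m_T^2$ against $1/\bar{I}$, since
Eq. (29) is linear in that variable. The sketch below shows the idea only. The
fitting procedure is an assumption (the report does not say how Eq. (29) was
verified), the function name is hypothetical, and the numbers in the
demonstration are synthetic values generated from Eq. (29) itself, not
measurements.

```python
import math


def fit_threshold_vs_luminance(luminances, m_t_values):
    """Least-squares fit of m_T**2 = A + B / I_mean at one retinal frequency.

    Returns (A, B). Needs at least two distinct luminance values.
    """
    xs = [1.0 / lum for lum in luminances]
    ys = [m * m for m in m_t_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


if __name__ == "__main__":
    # Synthetic, purely illustrative data built from assumed A and B.
    a_true, b_true = 1.0e-5, 2.0e-5
    lums = [0.1, 1.0, 10.0, 100.0]          # mean luminance in mL (hypothetical)
    thresholds = [math.sqrt(a_true + b_true / lum) for lum in lums]
    print(fit_threshold_vs_luminance(lums, thresholds))
```

In such a fit the shot-noise term $B/\bar{I}$ becomes small compared with $A$ as
the luminance increases, which is the behavior the text reports for luminances
above about 10 mL.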
D.  THE  CHROMINANCE  CHANNEL  CAPACITY


1.  Independent  Channel  Model

In order to include the effect of color in our information theory approach
to image descriptors, we must utilize a model to describe the processing of
chrominance information by the human visual system. There is some fairly strong
evidence elucidating the probable action of the color mechanism in humans. It
is known that, at the retina, light is received by individual "red," "green," and
"blue" color receptors (cones), as stipulated by the Young-Helmholtz theory.
However, the output of these receptors appears to be coded as the sum of a
luminance signal, which adds contributions from all three receptors, and two
color-difference signals [39,40]: Red vs Green and Yellow vs Blue. Thus, the
total signal is transmitted to the brain in three independent channels: one for
luminance and two for chrominance.

The model of three independent processing channels allows us to formulate, in a straightforward way, an extension of the theory presented in Section V.C to include the effect of chrominance information. We seek an expression for the total channel capacity $H$ of the display-observer system. To obtain the total channel capacity, we first write the total number of distinguishable states $W$ of the display-observer system as

$$W = W_L W_{C1} W_{C2} \tag{30}$$

where $W_{C1}$ and $W_{C2}$ represent the contributions of the two chrominance channels, and $W_L$ is the luminance contribution discussed above (Section V.C). The simple product form of Eq. (30) is the result of the hypothesis of independent luminance and chrominance channels in the visual system. If the channels did not carry on their respective processing functions independently, mixing between the channels would occur, and the resulting expression for $W$ would be far more complicated than the simple form of Eq. (30).

As a result of the product form of Eq. (30), the total channel capacity is simply the sum of contributions from the three independent channels:

$$H = \log_2 W = H_L + H_{C1} + H_{C2} \tag{31}$$

The contributions $H_{C1}$ and $H_{C2}$ of the two chrominance channels to the total channel capacity are simply

$$H_{C1} = \log_2 W_{C1}, \qquad H_{C2} = \log_2 W_{C2} \tag{32}$$

The luminance contribution $H_L$ was previously discussed in detail (see Section V.C).
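
The additivity in Eq. (31) follows directly from taking the logarithm of the product in Eq. (30). A small numerical illustration is given below; the state counts are arbitrary values chosen only for the example and are not taken from the report.

```python
import math

# Hypothetical numbers of distinguishable states for each channel
# (illustrative values only).
W_L, W_C1, W_C2 = 2.0**12, 2.0**5, 2.0**3

# Eq. (30): independent channels multiply their state counts.
W_total = W_L * W_C1 * W_C2

# Eqs. (31) and (32): the capacities, in bits, add.
H_L, H_C1, H_C2 = (math.log2(w) for w in (W_L, W_C1, W_C2))
H_total = math.log2(W_total)

print(H_total, H_L + H_C1 + H_C2)
```
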

The technique of Section V.C.1, employed for the computation of the luminance channel capacity, may be used to express the quantities $H_{Ci}$ ($i = 1,2$) as integrals over spatial frequency of the base-2 logarithm of the number of perceivable chrominance levels $n_{Ci}$ at a given spatial frequency:

$$H_{Ci} = w \int_{-\infty}^{+\infty} \frac{d\omega}{2\pi}\, \log_2 n_{Ci}(\omega) \quad \text{where } i = 1,2 \tag{33}$$

The meaning of the quantities $n_{Ci}(\omega)$ is as follows: For the luminance case, the quantity $n(\omega)$, appearing in Eq. (24) for $H_L$, represents the number of perceivable brightness levels for a given spatial frequency and for constant chrominance. The quantities $n_{Ci}(\omega)$ represent the number of perceivable colors for a given spatial frequency and luminance value, along the Red/Green and Yellow/Blue axes. Perceptually, $n(\omega)$ and $n_{Ci}(\omega)$ appear to be entirely different quantities. In the sense of information theory, however, all of these quantities appear on an equal footing.

2.  Chrominance  Contrast-Sensitivity  Functions  and  Equivalent  rms 
Chrominance  Modulation 

It now remains to obtain expressions for the quantities $n_{Ci}(\omega)$. In principle, we should require the same kind of experimental information for the chrominance case that the work of Nachmias and Sansbury [31] provided for the luminance case. That is, given a chrominance sine-wave grating of given modulation depth (color excursion) and spatial frequency, how much additional modulation must be supplied in order for a perceivable color difference to be detected? Unfortunately, we know of no such detailed measurements. However, the experiments of MacAdam [41] do give considerable information regarding the distribution of chrominance levels in the low-frequency limit. These experiments indicate that, in the vicinity of white, the required chrominance change for a jnd in color varies relatively slowly along any direction in the CIE chromaticity diagram. This observation indicates that, as long as we deal only with small excursions from white (relatively desaturated colors), a model based on a linear distribution of perceivable chrominance levels with fractional color saturation as the relevant psychophysical coordinate may be appropriate. The results of Section III show that, for natural scenes, color excursions are indeed small on a statistical basis. Therefore, in order to compute the chrominance contribution to the total channel capacity, we shall utilize a linear model for the distribution of perceived chrominance levels.

We proceed by analogy to the luminance case and postulate two threshold modulation functions for chrominance, $c_{T,i}(\nu)$ [$i = 1,2$]. The functions $c_{T,i}(\nu)$ represent the required fractional saturation modulation for an observer to perceive a color change when presented with a chrominance sine-wave grating centered about white. Figures 54 and 55 reproduce the experimental data of van der Horst and Bouman [42] for chrominance gratings along the respective Red/Blue-Green and Yellow/Blue axes of the chromaticity diagram. The modulation was centered about the equal-energy white point. Also shown in these figures are the solid curves representing simple empirical fits to the experimental data. At a retinal illumination of 160 td (90-mL source brightness), the approximate expressions for the $c_{T,i}(\nu)$, where $\nu$ is in cycles/degree-of-vision, are*


$$c_{T,R/B\text{-}G}(\nu) = 0.0049\,[1 + (\nu/\text{[illegible]}.5)^3]^{1/2}$$

$$c_{T,Y/B}(\nu) = 0.0108\,[1 + (\nu/3)^3]^{1/2} \tag{34}$$

Several important remarks should be made regarding these functions. First, it should be noted that, in contrast to the properties of the luminance-sensitivity functions [cf. Fig. 48], there is no evidence of a low-frequency roll-off in the chrominance-threshold-sensitivity functions; the low-frequency behavior of the $c_{T,i}(\nu)$ is flat. Second, the maximum sensitivity of the Red/Blue-Green axis is larger by more than a factor of 2 than that for the Yellow/Blue axis. Finally, the frequencies at which the two threshold-sensitivity functions roll off differ from each other by less than a factor of 2, but neither function rolls off faster than the luminance-threshold function given in Fig. 48.** These properties should be borne in mind, for they will play an important part in understanding the results presented later in this section.

*The sensitivity scale for the $c_{T,i}$ is defined as the required distance in the CIE diagram for an observer to perceive a jnd of purity, in units of the fraction of the distance between the equal-energy white point and the dominant wavelength. The dominant wavelength was 492 nm for Red/Blue-Green modulation and 573 nm for Yellow/Blue modulation.

**It is likely that, above about 20 cycles/degree-of-vision, the chrominance-threshold-sensitivity functions roll off more rapidly than would be predicted by Eq. (34).

[Figure 54 plot: R/B-G modulation about E-white point; data points at 160 td; abscissa: retinal frequency $\nu$ (cycles/degree-of-vision).]

Figure 54. Relative threshold sensitivity for chrominance gratings with amplitude directed along the Red/Blue-Green line in the CIE chromaticity diagram. Zero amplitude corresponds to the equal-energy (E) white point. The experimental points are taken from van der Horst and Bouman [42]. The value of the retinal illumination was 160 td. A simple empirical fit is indicated by the solid curve.


[Figure 55 plot: Y/B modulation about E-white point; data points at 160 td and 75 td; abscissa: retinal frequency $\nu$ (cycles/degree-of-vision).]

Figure 55. Relative threshold sensitivity for chrominance gratings with amplitude directed along the Yellow/Blue line in the CIE chromaticity diagram. Zero amplitude corresponds to the equal-energy (E) white point. The experimental points are taken from van der Horst and Bouman [42]. The values of the retinal illumination were 160 and 75 td. A simple empirical fit is indicated by the solid curve.


Next, as was done in the derivation of the luminance channel capacity, we represent the statistically averaged chrominance power spectrum by an equivalent rms chrominance sine-wave modulation $c_{eq,i}$ for each of the two chrominance channels $i = 1,2$ of the visual system. This representation rests on the assumption of the existence of frequency-specific independent channels for the processing of chrominance information, just as is observed for luminance sine-waves [34]. Whereas there is considerable experimental evidence regarding the existence of such channels for the luminance case, we know of no experimental work that has probed for indications of such channels in chrominance spatial-frequency processing. Nevertheless, we shall assume that such channels do in fact exist, an assumption justified, at this point, only by indirect experimental evidence [43] and by the physiological and psychophysical similarities of luminance and chrominance processing in the visual system.

With the assumption of frequency-specific chrominance channels, the derivation of an expression for $c_{eq,i}$ proceeds in exactly the same manner as that which led to Eq. (20) for the equivalent contrast $m_{eq}$ for the luminance case. We make use of the experimental observation (Section III) that the chrominance power spectrum can be represented by an inverse-square-frequency characteristic above a lower cutoff frequency $\omega_1 = 2\pi/w$ with vanishing average chrominance. The expression for $c_{eq,i}$ is directly analogous to Eq. (20):

$$c_{eq,i}^2 = (4/\pi)\,(r/w)\,\Delta\nu_{Ci}\,c_{m,i}^2/\nu^2 \quad \text{where } i = 1,2 \tag{35}$$

Here the quantity $\Delta\nu_{Ci}$ represents the width of the frequency-specific channel for axis $i$, and $c_{m,i}^2$ is the corresponding mean-square chrominance modulation.

We shall treat only the case of displays with no chrominance noise. Then, for a linear distribution of perceived chrominance levels, the number of levels is given by the expression

$$n_{Ci} = c_{eq,i}/c_{T,i} + 1 \quad \text{where } i = 1,2 \tag{36}$$

Just as in the luminance case, the frequency response characteristics of the display are accounted for by multiplying the right-hand side of Eq. (35) for $c_{eq,i}^2$ by $|R_{eff,Ci}|^2$, the square of the effective overall chrominance MTF of the display. Then substituting Eq. (35) into Eq. (36) and expressing the result in terms of the display frequency coordinate $\omega$ (with $\nu = \omega r/2\pi$), we have

$$n_{Ci}(\omega) = \left[8\,\omega_1\,\Delta\nu_{Ci}\,|R_{eff,Ci}|^2/r\right]^{1/2} \frac{c_{m,i}}{\omega\,c_{T,i}(\omega r/2\pi)} + 1, \quad \text{with } \omega_1 = 2\pi/w \tag{37}$$

Equations (33) and (37) constitute our result for the contribution of the chrominance channels to the total channel capacity.
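
To make Eqs. (34) to (36) concrete, the following sketch evaluates the number of perceivable chrominance levels for the Yellow/Blue axis as a function of retinal frequency, assuming a perfect display ($|R_{eff,Ci}| = 1$). The constants are the ones quoted in this section: the Yellow/Blue threshold fit of Eq. (34), a channel width of 1 cycle/degree-of-vision, an rms chrominance modulation of 0.10, and $r/w = 3$. The function names are hypothetical, and the sketch is not the original computation.

```python
import math

def threshold_modulation(nu, c0, nu_c):
    """Empirical threshold fit of the form used in Eq. (34):
    c_T(nu) = c0 * [1 + (nu / nu_c)**3]**0.5, with nu in cycles/degree."""
    return c0 * math.sqrt(1.0 + (nu / nu_c) ** 3)

def equivalent_modulation(nu, r_over_w, delta_nu, c_rms):
    """Equivalent rms sine-wave modulation from Eq. (35):
    c_eq**2 = (4/pi) * (r/w) * delta_nu * c_rms**2 / nu**2."""
    return math.sqrt((4.0 / math.pi) * r_over_w * delta_nu) * c_rms / nu

def perceivable_levels(nu, c0, nu_c, r_over_w=3.0, delta_nu=1.0, c_rms=0.10):
    """Eq. (36), linear model with no chrominance noise: n = c_eq / c_T + 1."""
    c_eq = equivalent_modulation(nu, r_over_w, delta_nu, c_rms)
    return c_eq / threshold_modulation(nu, c0, nu_c) + 1.0

# Yellow/Blue axis: c0 = 0.0108, nu_c = 3 cycles/degree.
for nu in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(nu, round(perceivable_levels(nu, 0.0108, 3.0), 2))
```

The same functions apply to the Red/Blue-Green axis once its fitted constants and rms modulation are supplied. Because the equivalent modulation falls as $1/\nu$ while the threshold is flat at low frequency, the computed number of levels decreases steadily with increasing retinal frequency.
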

3.  Application  of  the  Chrominance  Channel  Capacity 

In Fig. 56 is shown the computed number of perceivable chrominance levels $n_{Ci}$ as a function of retinal frequency for chrominance sine-wave gratings of 100% modulation and for the chrominance power spectrum of natural scenes at the viewing distance $r/w = 3$. For computations based on the power spectrum of natural scenes, Eq. (37) was employed; a perfect display ($R_{eff,Ci} = 1$) was assumed, and the width of the frequency-specific channels was taken to be 1 cycle/degree-of-vision, the same value as was used in the luminance case [33]. The value of the rms chrominance modulation was taken to be 0.30 for the Red/Blue-Green channel and 0.10 for the Yellow/Blue channel. These values are typical of the largest measured values for the $I_c$ and $Q_c$ axes, respectively* (see Section III). The use of smaller values of the rms chrominance modulation will reduce the computed $n_{Ci}$ accordingly.

*We neglect the small angular rotation (about 20°) between the $I_c$, $Q_c$ axes in the chromaticity diagram, along which the measurements of Section III were taken, and the Red/Blue-Green and Yellow/Blue axes, for which the perceptual data [42] are available.

[Figure 56 plot: curves labeled R/B-G channel, 100% modulation; Y/B channel, 100% modulation; R/B-G channel, power spectrum of natural scenes ($r/w = 3$); Y/B channel, power spectrum of natural scenes; abscissa: retinal frequency $\nu$ (cycles/degree-of-vision).]

Figure 56. Computed number of perceivable chrominance levels as a function of retinal frequency $\nu$ for gratings of 100% modulation (full saturation) and for the chrominance power spectrum of natural scenes at the viewing distance $r/w = 3$. The corresponding display frequencies at $r/w = 3$ are indicated at the top of the figure. Curves are shown for the Red/Blue-Green and Yellow/Blue chrominance channels.

From Fig. 56, it is seen that the Red/Blue-Green channel contributes far more perceivable chrominance levels than does the Yellow/Blue channel. This observation is particularly true for the curves calculated for the case of the power spectrum of natural scenes. It is due to the combination of the higher inherent sensitivity of the Red/Blue-Green channel, as indicated by Eq. (34), and the larger observed rms chrominance modulation for the $I_c$ axis, as discussed in Section III.

In comparing the results shown in Fig. 56 with those shown in Fig. 49 for the luminance case, the following important differences are noted: first, at the lowest retinal frequencies, the chrominance channels provide a far greater number of perceivable levels than the luminance channel; second, for the power spectrum case, the chrominance levels roll off rapidly with increasing retinal frequency, whereas the luminance levels are spread over a relatively wide band of frequencies. This fundamentally different behavior is due to the absence of a low-frequency roll-off in the chromaticity-sensitivity functions. The good low-frequency sensitivity of the chromaticity channels, combined with the inverse-square frequency power spectra, concentrates most of the available perceivable levels at very low frequencies. It should be emphasized that this result does not depend on the relative passband widths of the various channels; that is, the acuity or high-frequency cutoff of the sensitivity functions is not involved in determining this fundamental behavior.

In Fig. 51 are shown the computed chrominance channel capacities as a function of display bandwidth. Equations (33) and (37) were employed in the calculations. The display chrominance passbands were represented by an ideal low-pass filter. It is seen from this figure that the computed $H_{Ci}$ rise rapidly at first, but soon bend over and approach an asymptotic value. Indeed, the contribution of the Red/Blue-Green channel alone is larger than the luminance contribution for display bandwidths less than about 40 TV-lines. The sum of the contributions of the two chrominance channels is larger than the luminance contribution for display bandwidths below about 140 TV-lines. The sharp initial rise of the chrominance channel capacities is due to the concentration of the perceivable chrominance levels at low retinal frequencies, as discussed above.
As a practical application of the channel capacity formalism, we have computed the optimum bandwidth allocation among the three information channels (one luminance and two chrominance) for a display whose total passband is constrained to a particular value. In the example considered, the sum of the widths of the luminance and chrominance passbands was constrained to the value of 450 TV-lines, and the viewing distance was once again taken to be three picture widths. These values were chosen because they approximate the bandwidth and typical viewing conditions for U.S. commercial television. We sought the optimum tradeoff, from the information theory point of view, between the passbands for the luminance, Red/Blue-Green, and Yellow/Blue channels, subject to the constraint that the total available bandwidth* is 450 TV-lines. Mathematically, the total channel capacity $H$, given in Eq. (31), was maximized for the assumed total bandwidth constraint.
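
The constrained maximization described above can be sketched as a brute-force search over integer passband widths. The three capacity curves below are hypothetical saturating placeholders chosen only so that the example runs; they are not the functions computed from Eqs. (24), (33), and (37), so the allocation this sketch prints should not be compared with the tabulated results.

```python
import math

TOTAL_TV_LINES = 450  # total passband constraint used in the text

# Hypothetical capacity curves H(B) in bits versus passband B (TV-lines).
# Each rises and then saturates; all constants are placeholders.
def h_lum(b):
    return 1600.0 * (1.0 - math.exp(-b / 150.0))

def h_c1(b):
    return 700.0 * (1.0 - math.exp(-b / 40.0))

def h_c2(b):
    return 400.0 * (1.0 - math.exp(-b / 15.0))

def best_allocation(total, step=1):
    """Exhaustive search for (B_L, B_C1, B_C2) maximizing the summed
    capacity of Eq. (31) subject to B_L + B_C1 + B_C2 = total."""
    best = (-1.0, None)
    for b1 in range(0, total + 1, step):
        for b2 in range(0, total - b1 + 1, step):
            bl = total - b1 - b2
            h = h_lum(bl) + h_c1(b1) + h_c2(b2)
            if h > best[0]:
                best = (h, (bl, b1, b2))
    return best

h_max, (b_l, b_c1, b_c2) = best_allocation(TOTAL_TV_LINES)
print(b_l, b_c1, b_c2, round(h_max))
```

An exhaustive search is used here for transparency; with smooth capacity curves, a Lagrange-multiplier condition (equal marginal capacity per TV-line in each channel) would give the same optimum.
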

The result of the calculation is given in the first two entries of Table 10. It is seen that by devoting about 140 TV-lines of the available passband to chrominance signals, the total channel capacity is significantly enhanced over the case where the total passband is devoted to luminance signals alone. This result is in very good agreement with the actual apportionment utilized in current U.S. color receivers. The computed 5:1 ratio of the passband widths assigned to the Red/Blue-Green and Yellow/Blue channels is to be compared with the actual 3:1 ratio for the bandwidths associated with the $I_c$ and $Q_c$ axes of the U.S. color (NTSC) system. The computed ratio can be reduced by taking into account the slight misalignment of the perceptual chrominance axes and the $I_c$ and $Q_c$ axes. Considering the crude nature of the calculations, the overall agreement with the empirically established color-television passband allocation is considered very satisfactory.

TABLE 10.  BANDWIDTHS AND CHANNEL CAPACITIES OF DISPLAYS AT r/w = 3

                                   Bandwidth (TV-lines)
Display                    Luminance   Red/Blue-Green   Yellow/Blue   H (bits)
Optimum Color†                310           118              22         1850
450-TV-Line Monochrome        450             0               0         1520
Perfect Color                  ∞              ∞               ∞         2700
Perfect Monochrome             ∞              0               0         1600

† Maximum H for total passband of 450 TV-lines.

*In the example, it was assumed that the passband is continuous and cannot be sampled at discrete frequencies. Thus, interspersing the luminance and chrominance signals is not possible, as it is in the case of color television.
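
The figures quoted in this comparison follow directly from the entries of Table 10. A short arithmetic check, with the table values typed in by hand, is sketched below; the dictionary keys are arbitrary labels chosen for the example.

```python
# Entries of Table 10 (bandwidths in TV-lines, capacities in bits).
optimum = {"lum": 310, "rbg": 118, "yb": 22, "H": 1850}
monochrome_450 = {"lum": 450, "rbg": 0, "yb": 0, "H": 1520}

total = optimum["lum"] + optimum["rbg"] + optimum["yb"]   # total passband
chroma = optimum["rbg"] + optimum["yb"]                   # share given to chrominance
ratio = optimum["rbg"] / optimum["yb"]                    # about 5.4, i.e. roughly 5:1
gain = optimum["H"] / monochrome_450["H"] - 1.0           # about 0.22

print(total, chroma, round(ratio, 1), round(100 * gain))
```

The optimized allocation thus uses the full 450 TV-lines, assigns 140 of them to chrominance, and raises the computed capacity by roughly one-fifth relative to the 450-TV-line monochrome display.
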

The last two entries in Table 10 give the computed results for the total channel capacity of a "perfect" color display (one with infinite passbands for all three channels) and a "perfect" monochrome display (one with an infinite luminance passband but no color capability). Comparing these results with the computed $H$-value for the optimized color display, it is seen that the channel capacity of the optimized, but band-limited, color display exceeds that of the perfect monochrome display, even though only 140 TV-lines are devoted to chrominance signals. This result is a testimony to the effectiveness of color as a low-frequency information channel and arises, as discussed earlier, from the excellent low-frequency color sensitivity of the visual system.

Table 10 indicates that significantly greater channel capacity can be achieved in wide-band color systems. We suspect that the strength of this conclusion is somewhat exaggerated by our extension of the chrominance-sensitivity functions to retinal frequencies above those actually measured [42], resulting in a probable overestimate of the values of $H_{Ci}$ for large chrominance bandwidths. However, we consider the qualitative nature of our conclusion to be valid.